Abstract



Over the past decade, physicists have developed deep but non-rigorous techniques for studying phase
transitions in discrete structures. Recently, their ideas have been harnessed to obtain improved rigorous
results on the phase transitions in binary problems such as random $k$-SAT or $k$-NAESAT (e.g.,
Coja-Oghlan and Panagiotou: STOC 2013). However, these rigorous arguments, typically centered around
the second moment method, do not extend easily to problems where there are more than two possible
values per variable. The single most intensely studied example of such a problem is random graph
$k$-coloring. Here we develop a novel approach to the second moment method in this problem. This new
method, inspired by physics conjectures on the geometry of the set of $k$-colorings, allows us to establish
a substantially improved lower bound on the $k$-colorability threshold. The new lower bound is within an
additive $2\ln 2 + o_k(1) \approx 1.39$ of a simple first-moment upper bound and within
$2\ln 2 - 1 + o_k(1) \approx 0.39$ of the physics conjecture. By comparison, the best previous lower bound
left a gap of about $2 + \ln k$, unbounded in terms of the number of colors [Achlioptas, Naor: STOC 2004].
Furthermore, we prove that, in a precise sense, our lower bound marks the so-called condensation phase
transition predicted on the basis of physics arguments [Krzakala et al.: PNAS 2007].
Key words: random structures, phase transitions, graph coloring. 
1 Introduction

Let G(n,m) be the random graph on the vertex set $V = \{1, \ldots, n\}$ with m edges. Unless specified
otherwise, we assume that $m = \lceil dn/2 \rceil$ for a number $d > 0$ that remains fixed as $n \to \infty$
and that $k \ge 3$ is an n-independent integer. We say that G(n, m) has a property $\mathcal{E}$ with high
probability ('w.h.p.') if

$\lim_{n\to\infty} \mathrm{P}[G(n,m) \in \mathcal{E}] = 1.$



The theory of random graphs started with the famous 1960 article by Erdős and Rényi [20]. In that paper
they established the existence of a phase transition by proving the sudden emergence of a giant component
at $d \sim 1$. They also set the agenda for future research by posing a number of intriguing questions. To
date, all but one of these questions have been answered. The last one that remains open concerns the
chromatic number of G(n, m). More precisely, by now it is widely conjectured that there is a phase
transition for $k$-colorability for any fixed $k \ge 3$ [Q].

Over the years the random graph coloring problem has inspired the development of techniques, both 
analytic and algorithmic, that are by now widely applied in combinatorics, computer science and beyond. 
For instance, pioneering the use of martingale tail bounds in discrete mathematics, Shamir and Spencer [38]
proved concentration bounds for the chromatic number. Their result was enhanced first by Łuczak [31] and
then by Alon and Krivelevich [9], who used the Lovász Local Lemma to prove that the chromatic number
of G(n,m) is concentrated on two consecutive integers if $m \ll n^{3/2}$. In a breakthrough contribution,
Bollobás [12] determined the asymptotic value of the chromatic number of dense random graphs (with
$m = \Omega(n^2)$). This result improved prior work by Matula [32], whose "merge-and-exposure" technique
Łuczak built upon to approximate the chromatic number of sparse random graphs [30]. In addition, the
random graph coloring problem has led to new algorithmic ideas. A prominent example is the spectral
coloring algorithm of Alon and Kahale [8], which paved the way for tackling many other clustering and
partitioning problems.

1 We owe this observation to Charilaos Efthymiou.

With respect to the $k$-colorability phase transition, the state of affairs is as follows. Achlioptas and
Friedgut [1] showed that for any fixed $k \ge 3$ there exists a sharp threshold sequence
$d_{k\text{-col}} = d_{k\text{-col}}(n)$. This sequence is such that for any $\varepsilon > 0$ the random graph
G(n, m) is $k$-colorable w.h.p. if the average degree is less than $(1-\varepsilon)\,d_{k\text{-col}}(n)$, but
there is no $k$-coloring w.h.p. if it exceeds $(1+\varepsilon)\,d_{k\text{-col}}(n)$. While this is a pure
existence result, in a landmark paper Achlioptas and Naor [7] proved that

$d_{k\text{-col}} \ge d_{k,\mathrm{AN}} = 2(k-1)\ln(k-1) = 2k\ln k - 2\ln k - 2 + o_k(1),$ (1)

where the $o_k(1)$ hides a term that tends to zero for large k. By comparison, a simple "first moment"
calculation shows that

$d_{k\text{-col}} \le d_{k,\mathrm{first}} = 2k\ln k - \ln k.$ (2)

This leaves a gap of about $2 + \ln k$, a function that diverges as k gets larger.

Independently of the rigorous work, the random graph coloring problem has been studied in statistical
physics under the snappy title of "diluted mean-field Potts antiferromagnet". In fact, over the past decade
physicists have developed a deep but mathematically non-rigorous formalism called the "cavity method"
for locating phase transitions in discrete structures [33, 35]. According to the cavity method [29, 36, 37, 39],

$d_{k\text{-col}} = 2k\ln k - \ln k - 1 + o_k(1).$ (3)

In addition, the cavity method has inspired new message passing algorithms called Belief/Survey
Propagation Guided Decimation [13, 33].

Recently, there has been progress in verifying the physicists' predictions on the phase transitions in
binary constraint satisfaction problems. For instance, the current gap between the best lower and upper
bounds in random $k$-SAT is $\approx 0.19$ [15]. In random $k$-NAESAT the gap is as tiny as $O(2^{-k})$
[14, 17]. This leaves graph $k$-coloring as the single most prominent example with a gap that is unbounded
in terms of k.

This large gap remains because the techniques of [14, 15, 17] do not extend easily beyond binary
problems. More specifically, the presence of k possible colors (in physics jargon, 'spins') per vertex
dramatically complicates the use of the second moment method, the mainstay for proving lower bounds.
Indeed, the best previous result [7] on random graph coloring, based on the second moment method, is
perhaps one of the most sophisticated contributions to the theory of random graphs.

Here we develop a new approach to the second moment method in the presence of more than two
spins. This approach, based on an analysis of the geometry of the set of $k$-colorings and a local variations
argument, is directly inspired by physics ideas. We view this technique as an important step towards the
long-term goal of providing a rigorous foundation for the 'cavity method'. The new technique enables us
to prove

Theorem 1.1 The k-colorability threshold satisfies

$d_{k\text{-col}} \ge d_{k,\mathrm{cond}} - o_k(1)$, with $d_{k,\mathrm{cond}} = 2k\ln k - \ln k - 2\ln 2.$ (4)

The gap between the new lower bound (4) and the elementary upper bound (2) is an additive
$2\ln 2 + o_k(1) \approx 1.39$, rather than a function that grows with k. Moreover, the gap between (4) and
the physics prediction (3) is a mere $2\ln 2 - 1 \approx 0.39$.
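The gaps quoted above can be checked numerically. The following is a minimal sketch that evaluates the leading-order expressions in (1) to (4) while ignoring all $o_k(1)$ terms; the function names are illustrative and not taken from the paper.

```python
import math

def d_an(k):
    """Leading term of the Achlioptas-Naor lower bound in (1)."""
    return 2 * (k - 1) * math.log(k - 1)

def d_first(k):
    """First-moment upper bound (2)."""
    return 2 * k * math.log(k) - math.log(k)

def d_physics(k):
    """Cavity-method prediction (3), without the o_k(1) term."""
    return d_first(k) - 1

def d_cond(k):
    """Condensation bound from (4), without the o_k(1) term."""
    return d_first(k) - 2 * math.log(2)

for k in (10, 100, 1000):
    # Columns: k, gap left by (1), gap left by (4), distance from (4) to (3).
    print(k,
          round(d_first(k) - d_an(k), 3),
          round(d_first(k) - d_cond(k), 3),
          round(d_physics(k) - d_cond(k), 3))
```

The first gap column grows like $2 + \ln k$, whereas the other two stay constant in k.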

In fact, Theorem 1.1 determines the chromatic number of G(n, m) exactly for "most" average degrees d.
More precisely, let us say that a (measurable) set $A \subset \mathbb{R}_{\ge 0}$ has density $\alpha$ if
$\lim_{z\to\infty} \frac{1}{z}\int_0^z \mathbf{1}_A = \alpha$, where $\mathbf{1}_A$ is the indicator of A.

Corollary 1.2 There exist a set $A \subset \mathbb{R}_{\ge 0}$ of density 1 and a function
$F : A \to \mathbb{Z}_{\ge 0}$ such that for all average degrees $d \in A$ we have $\chi(G(n,m)) = F(d)$ w.h.p.

2 In order to prove that there actually is a sharp $k$-colorability threshold one would have to show that
$d_{k\text{-col}}(n)$ converges.

To be specific, A is the union of the intervals $(d_{k-1,\mathrm{first}},\, d_{k,\mathrm{cond}} - o_k(1))$ where
Theorem 1.1 and (2) show that G(n, m) is $k$-colorable but not $(k-1)$-colorable w.h.p.
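To get a feeling for why the union of these intervals has density 1, one can compute which fraction of the degree axis they cover up to a given number of colors. The sketch below drops the $o_k(1)$ terms and uses hypothetical helper names, so it illustrates the trend rather than the exact set A.

```python
import math

def d_first(k):
    """First-moment upper bound (2)."""
    return 2 * k * math.log(k) - math.log(k)

def d_cond(k):
    """Lower bound from (4), without the o_k(1) term."""
    return d_first(k) - 2 * math.log(2)

def covered_fraction(k_max, k_min=3):
    """Fraction of [d_first(k_min - 1), d_first(k_max)] that lies in
    some interval (d_first(k - 1), d_cond(k)) with k_min <= k <= k_max."""
    total = d_first(k_max) - d_first(k_min - 1)
    covered = sum(max(0.0, d_cond(k) - d_first(k - 1))
                  for k in range(k_min, k_max + 1))
    return covered / total

for k_max in (10, 100, 1000, 10000):
    print(k_max, round(covered_fraction(k_max), 4))
```

Each interval misses only a stretch of length $2\ln 2$ out of a span of roughly $2\ln k$, so the covered fraction creeps towards 1 as more colors are included.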

Corollary 1.2 improves a result from [7], who used (1) and (2) to determine the chromatic number on
a set A' of density 1/2. Furthermore, Corollary 1.2 answers a question of Alon and Krivelevich whether the
chromatic number of G(n, m) is concentrated on a single integer for most d "in an appropriately defined
sense" [9] in the case $m = O(n)$.

We expect that the new approach to the second moment method will apply to other problems with more 
than two possible values for each variable. Examples of this include various assignment or other coloring 
problems. Furthermore, and perhaps more importantly, we believe that our new second moment argument 
is going to be a necessary ingredient in the analysis of the physicists' message passing algorithms; we are 
going to comment on such potential applications in Section 3 below.

Finally, why doesn't our second moment argument determine the threshold $d_{k\text{-col}}$ precisely?
According to the cavity method, the demise of the second moment method at $d_{k,\mathrm{cond}}$ in
Theorem 1.1 is due to a phase transition called condensation that marks a change in the geometry of the
set of $k$-colorings [28, 39]. More precisely, for average degrees smaller than $d_{k,\mathrm{cond}} - o_k(1)$,
the $k$-colorings are arranged in well-separated "clusters", each containing only an exponentially small
fraction of the total number of $k$-colorings. As the average degree crosses $d_{k,\mathrm{cond}} + o_k(1)$,
this formation changes: the size of the largest cluster has the same order of magnitude as the total number
of $k$-colorings w.h.p. In effect, a bounded number of clusters dominate the entire set of $k$-colorings.
Hence the term "condensation".

Based on our techniques we can indeed verify that, in a precise sense, such a phase transition occurs
at $d_{k,\mathrm{cond}}$ (see Proposition 2.2 below). But before we come to that we need to discuss the second
moment method and its relationship to the physics predictions.

2 Graph coloring and the second moment method 

Most of the current results on phase transitions in random constraint satisfaction problems are based on the
second moment method. Suppose that $Z = Z(G(n,m)) \ge 0$ is a random variable such that $Z(G) > 0$
only if G is $k$-colorable. Moreover, suppose that there is a number $C = C(d,k) > 0$ that may depend on
the average degree d and the number of colors k but not on n such that

$0 < \mathrm{E}[Z^2] \le C \cdot \mathrm{E}[Z]^2.$ (5)

Then the Paley-Zygmund inequality

$\mathrm{P}\left[Z > \tfrac{1}{2}\mathrm{E}[Z]\right] \ge \frac{\mathrm{E}[Z]^2}{4\,\mathrm{E}[Z^2]}$ (6)

yields $\mathrm{P}[\chi(G(n,m)) \le k] \ge \mathrm{P}[Z > 0] \ge (4C)^{-1} > 0$. Furthermore, the sharp threshold
result [1] implies

Lemma 2.1 Let $d > 0$ and $k \ge 3$ be such that $\liminf_{n\to\infty} \mathrm{P}[\chi(G(n,m)) \le k] > 0$.
Then $d_{k\text{-col}} \ge d - o(1)$.

Thus, in order to obtain a lower bound on $d_{k\text{-col}}$, we need to find an appropriate random variable Z
and verify (5). Both of these steps turn out to be non-trivial.
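The role of condition (5) can be seen on toy distributions. The sketch below is an illustration that is not part of the paper: it evaluates both sides of the Paley-Zygmund inequality (6) for a finite random variable given by hypothetical lists `values` and `probs`.

```python
def paley_zygmund_check(values, probs):
    """Return (P[Z > E[Z]/2], E[Z]^2 / (4 E[Z^2])) for a finite distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    second = sum(v * v * p for v, p in zip(values, probs))
    lower_bound = mean ** 2 / (4 * second)
    actual = sum(p for v, p in zip(values, probs) if v > mean / 2)
    return actual, lower_bound

# A well-concentrated variable: E[Z^2] is within a constant factor of E[Z]^2.
print(paley_zygmund_check([0, 2], [0.5, 0.5]))

# A "lottery" variable with the same mean: a rare huge value inflates E[Z^2],
# so the bound (and the true probability of Z > 0) becomes tiny.
print(paley_zygmund_check([0, 1000], [0.999, 0.001]))
```

Both variables have mean 1, but only the first one satisfies (5) with a small constant C. The second one shows how rare outcomes with a very large value of Z make the second moment bound useless.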

2.1 Balanced colorings and the Birkhoff polytope

Perhaps the most obvious choice of random variable is the total number $Z_{k\text{-col}}$ of $k$-colorings of
G(n, m). However, following [7] we are going to work with a particular type of colorings to simplify our
calculations: we call a map $\sigma : V \to [k]$ balanced if $|\sigma^{-1}(i)| = n/k$ for all $i \in [k]$. Let
$\mathcal{B}$ be the set of all such balanced maps and let $Z_{k,\mathrm{bal}}$ be the number of balanced
$k$-colorings of G(n, m). As it turns out, the second moment bound (5) does not hold for either
$Z_{k\text{-col}}$ or $Z_{k,\mathrm{bal}}$ in the entire range $0 < d < d_{k,\mathrm{cond}}$. To remedy this
problem, we need to understand its origin. Thus, let us sketch the approach taken in [7] in the following
paragraphs.

3 A proof that the threshold sequence $d_{k\text{-col}}(n)$ converges would imply a one-point concentration
result for the chromatic number outside a countable set of average degrees. However, the known
non-uniform sharp threshold result does not.

4 We assume for the sake of presentation that n is divisible by k. Otherwise one requires
$\big||\sigma^{-1}(i)| - n/k\big| < 1$ instead.

To get started, we compute the first moment. More precisely, since the first moment scales exponentially
with n, we estimate its logarithm. By Stirling's formula the number of balanced $\sigma : V \to [k]$ is
$|\mathcal{B}| = k^{n-o(n)}$. Furthermore, for any balanced $\sigma$ a random edge is bichromatic with
probability $1 - 1/k + O(1/n)$. Since G(n, m) consists of $m \sim dn/2$ nearly independent random edges,
we obtain

$\frac{1}{n}\ln \mathrm{E}[Z_{k,\mathrm{bal}}] \sim \ln k + \frac{d}{2}\ln(1 - 1/k).$ (7)
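As a quick sanity check, the right-hand side of (7) can be evaluated directly. The sketch below (function names are illustrative) also solves for the degree at which the exponent vanishes, which should sit close to the first-moment bound $2k\ln k - \ln k$ from (2).

```python
import math

def first_moment_exponent(k, d):
    """Right-hand side of (7): ln k + (d/2) ln(1 - 1/k)."""
    return math.log(k) + 0.5 * d * math.log(1 - 1 / k)

def first_moment_zero(k):
    """Degree d at which the exponent in (7) equals zero."""
    return -2 * math.log(k) / math.log(1 - 1 / k)

k = 100
d_first = 2 * k * math.log(k) - math.log(k)
print(first_moment_zero(k), d_first)
```

Above this degree the expected number of balanced colorings is exponentially small, so by Markov's inequality there is no balanced coloring w.h.p.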

Working out the second moment is not quite so straightforward. Since $\mathrm{E}[Z_{k,\mathrm{bal}}^2]$ is
nothing but the expected number of pairs of balanced $k$-colorings, we need to compute the probability
that two balanced $\sigma, \tau \in \mathcal{B}$ simultaneously happen to be $k$-colorings of G(n, m). Of
course, this probability will depend on how "similar" $\sigma, \tau$ are.

But how do we quantify similarity? In binary problems such as $k$-SAT one could simply settle for the
number of variables on which the two assignments coincide. However, for $\sigma, \tau \in \mathcal{B}$
knowing the number of vertices that receive the same color is insufficient. For instance, $\tau$ could be
obtained from $\sigma$ simply by permuting the color classes, in which case $\sigma, \tau$ are identical as far
as the $k$-coloring problem goes without coloring a single vertex the same. Moreover, it is easy to construct
examples where even applying the "obvious" permutation does not help. Therefore, we introduce the
$k \times k$ overlap matrix $\rho(\sigma, \tau)$ whose entries

$\rho_{ij}(\sigma,\tau) = \frac{k}{n}\,\big|\sigma^{-1}(i) \cap \tau^{-1}(j)\big|$

represent the proportion of vertices with color i under $\sigma$ and color j under $\tau$. The need for this
high-dimensional overlap parameter is the root of our troubles.
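A small sketch may help make the overlap matrix concrete. It assumes colorings are given as Python lists with colors numbered 0, ..., k-1 (a convention chosen here for illustration) and computes the matrix of normalized intersection sizes.

```python
def overlap_matrix(sigma, tau, k):
    """Overlap matrix rho(sigma, tau): entry (i, j) is
    (k / n) * |{v : sigma(v) = i and tau(v) = j}|."""
    n = len(sigma)
    counts = [[0] * k for _ in range(k)]
    for s, t in zip(sigma, tau):
        counts[s][t] += 1
    return [[k * c / n for c in row] for row in counts]

# Two balanced maps on 6 vertices; tau is sigma with the colors rotated.
sigma = [0, 0, 1, 1, 2, 2]
tau = [(c + 1) % 3 for c in sigma]
for row in overlap_matrix(sigma, tau, 3):
    print(row)
```

Here no vertex keeps its color, yet the overlap matrix is a permutation matrix, which reveals that the two maps induce the same partition into color classes. For balanced maps every row and column of the matrix sums to one.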

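The upshot is that $\rho(\sigma, \tau)$ contains all the information necessary to determine the probability
that both $\sigma, \tau$ are $k$-colorings. In fact, let $Z_{\rho,\mathrm{bal}}$ be the number of pairs of
balanced $k$-colorings with overlap $\rho$. Then

$\frac{1}{n}\ln \mathrm{E}[Z_{\rho,\mathrm{bal}}] \sim f(\rho) = \ln k - \frac{1}{k}\sum_{i,j=1}^{k} \rho_{ij}\ln\rho_{ij} + \frac{d}{2}\ln\Big(1 - \frac{2}{k} + \frac{1}{k^2}\sum_{i,j=1}^{k}\rho_{ij}^2\Big).$ (8)

(We use the convention that $0 \ln 0 = 0$.)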

Let $\mathcal{R}$ denote the set of all possible overlap matrices. Then
$\mathrm{E}[Z_{k,\mathrm{bal}}^2] = \sum_{\rho \in \mathcal{R}} \mathrm{E}[Z_{\rho,\mathrm{bal}}]$.
Furthermore, because we confined ourselves to balanced $k$-colorings, all the overlap matrices
$\rho \in \mathcal{R}$ are doubly-stochastic, i.e., all rows and columns sum to one. In fact, as n grows
$\mathcal{R}$ is dense in the set $\mathcal{D}$ of all doubly-stochastic $k \times k$ matrices, the Birkhoff
polytope. Hence, we can express the second moment as an optimization problem over $\mathcal{D}$, namely

$\frac{1}{n}\ln \mathrm{E}[Z_{k,\mathrm{bal}}^2] \sim \max_{\rho \in \mathcal{D}} f(\rho).$ (9)

(Upon taking logarithms the sum $\sum_{\rho \in \mathcal{R}} \mathrm{E}[Z_{\rho,\mathrm{bal}}]$ turns into a
max because the total number $|\mathcal{R}|$ of summands is easily bounded by $n^{k^2}$, a polynomial in n.)

Let $\bar\rho = \frac{1}{k}\mathbf{1}$ be the matrix with all entries equal to $1/k$, the barycenter of the
Birkhoff polytope. A glimpse at (8) reveals that $f(\bar\rho) \sim \frac{2}{n}\ln \mathrm{E}[Z_{k,\mathrm{bal}}]$,
which corresponds to the square of the first moment. Therefore, a necessary condition for the success of
the second moment method is that the maximum (9) is attained at $\bar\rho$. Indeed, if
$f(\rho) > f(\bar\rho)$ for some $\rho \in \mathcal{D}$, then $\mathrm{E}[Z_{k,\mathrm{bal}}^2]$ exceeds
$\mathrm{E}[Z_{k,\mathrm{bal}}]^2$ by an exponential factor, because (9) is on a logarithmic scale.

This necessary condition turns out to be sufficient, i.e., the second moment method succeeds iff the
dominant contribution to (9) comes from $\bar\rho$. Combinatorially, this means that pairs $\sigma, \tau$
that, judging by their overlap, look completely uncorrelated make up the lion's share of
$\mathrm{E}[Z_{k,\mathrm{bal}}^2]$.
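The function $f$ from (8) is easy to evaluate numerically. The following minimal sketch (matrices as nested Python lists, helper names chosen here for illustration) checks that the barycenter yields twice the first moment exponent (7), and that the identity matrix, i.e. the overlap of a coloring with itself, yields the first moment exponent.

```python
import math

def f(rho, d):
    """The function f from (8), using the convention 0 ln 0 = 0."""
    k = len(rho)
    entropy = math.log(k) - sum(x * math.log(x)
                                for row in rho for x in row if x > 0) / k
    sum_sq = sum(x * x for row in rho for x in row)
    return entropy + 0.5 * d * math.log(1 - 2 / k + sum_sq / k ** 2)

def barycenter(k):
    """Matrix with all entries 1/k."""
    return [[1 / k] * k for _ in range(k)]

def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

k, d = 5, 10.0
first = math.log(k) + 0.5 * d * math.log(1 - 1 / k)  # exponent from (7)
print(f(barycenter(k), d), 2 * first)
print(f(identity(k), d), first)
```

Since $f(\mathrm{id})$ equals the first moment exponent and $f(\bar\rho)$ equals twice that exponent, the barycenter beats the identity exactly when the first moment exponent is positive.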



5 Equation (8) follows because by inclusion/exclusion a single random edge is bichromatic under both
$\sigma, \tau$ with probability $1 - \frac{2}{k} + \frac{1}{k^2}\sum_{i,j}\rho_{ij}^2 + O(1/n)$. Moreover, the
number of pairs $(\sigma, \tau)$ with overlap $\rho$ is
$\binom{n}{\rho_{11}n/k,\, \ldots,\, \rho_{kk}n/k}$ (cf. (7)).



2.2 A first attempt: the singly-stochastic bound 

Unfortunately, solving (9) proves seriously difficult. Achlioptas and Naor resort to a relaxation: letting
$\mathcal{S}$ denote the set of all $k \times k$ singly stochastic matrices (with all row sums equal to one but
no constraints on the column sums), they study $\max_{\rho \in \mathcal{S}} f(\rho)$. This optimization
problem turns out to be much more amenable. In fact, while in (9) all matrix entries are tied together by
the constraint that $\rho$ be doubly stochastic, in $\max_{\rho \in \mathcal{S}} f(\rho)$ the constraints are
confined to single rows. Thus, $\max_{\rho \in \mathcal{S}} f(\rho)$ decomposes into k separate optimization
problems, each over a $k$-dimensional simplex.

Yet even solving this relaxation is quite non-trivial. Achlioptas and Naor perform a sophisticated
"global" analysis based on chasing the zeros of the differentials of certain functions related to $f$, the signs
of the second differentials at these points, etc. (up to the sixth derivative). They manage to solve the relaxed
problem completely. The result is that its maximum and thus that of (9) is attained at the doubly-stochastic
$\bar\rho$ for $d \le d_{k,\mathrm{AN}}$, about an additive $\ln k$ below $d_{k\text{-col}}$ (cf. (1)).

But for larger densities the maximum of $f(\rho)$ over singly-stochastic $\rho$ is attained at a matrix that
fails to be doubly-stochastic. Indeed, the maximizer is very close to the matrix $\rho_{\mathrm{half}}$ whose
first $k/2$ rows coincide with those of the identity matrix id (with ones on the diagonal and zeros elsewhere)
and whose last $k/2$ rows have all entries equal to $1/k$. Of course, $\rho_{\mathrm{half}}$ fails to be
doubly-stochastic. Hence, one might hope that $\bar\rho$ remains the maximizer of (9) for d up to
$d_{k,\mathrm{cond}}$. That is, however, not the case. Indeed, consider the doubly-stochastic

$\rho_{\mathrm{stable}} = (1 - 1/k)\,\mathrm{id} + k^{-2}\mathbf{1},$ (10)

where $\mathbf{1}$ denotes the matrix with all entries equal to one. A simple calculation reveals that
$f(\rho_{\mathrm{stable}}) > f(\bar\rho)$, and thus that the second moment argument for $Z_{k,\mathrm{bal}}$
fails, for d strictly below $d_{k,\mathrm{cond}}$.
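The "simple calculation" can be mirrored numerically. The sketch below evaluates $f$ from (8) in closed form at $\bar\rho$ and at $\rho_{\mathrm{stable}}$ from (10), exploiting that $\rho_{\mathrm{stable}}$ has only two distinct entry values. The choice k = 100 and the two test degrees are illustrative assumptions, and $d_{k,\mathrm{cond}}$ is taken without its $o_k(1)$ term.

```python
import math

def f_bar(k, d):
    """f at the barycenter: twice the first moment exponent (7)."""
    return 2 * math.log(k) + d * math.log(1 - 1 / k)

def f_stable(k, d):
    """f at rho_stable = (1 - 1/k) id + k^{-2} 1 from (10)."""
    a = 1 - 1 / k + 1 / k ** 2      # diagonal entries
    b = 1 / k ** 2                  # off-diagonal entries
    entropy = math.log(k) - a * math.log(a) - (k - 1) * b * math.log(b)
    sum_sq = k * a * a + k * (k - 1) * b * b
    return entropy + 0.5 * d * math.log(1 - 2 / k + sum_sq / k ** 2)

def d_cond(k):
    return 2 * k * math.log(k) - math.log(k) - 2 * math.log(2)

k = 100
for shift in (2.0, 0.3):
    d = d_cond(k) - shift
    print(shift, f_bar(k, d), f_stable(k, d))
```

For these parameters the barycenter still wins at $d_{k,\mathrm{cond}} - 2$, while $\rho_{\mathrm{stable}}$ has the larger value at $d_{k,\mathrm{cond}} - 0.3$, in line with the claim that the vanilla argument breaks down strictly below $d_{k,\mathrm{cond}}$.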

2.3 The new approach 

Thus, to prove Theorem 1.1 we need to work with a different random variable. The key observation behind
its definition is that the second moment (9) is driven up by certain "pathological" $k$-colorings $\sigma$.
Their number behaves like a lottery: while the random graph typically has few such colorings, a tiny fraction
of graphs have an abundance, boosting the second moment. To exclude these pathological cases, we
define a notion of "good" colorings. This induces a decomposition
$Z_{k,\mathrm{bal}} = Z_{k,\mathrm{good}} + Z_{k,\mathrm{bad}}$ such that
$\mathrm{E}[Z_{k,\mathrm{good}}] \sim \mathrm{E}[Z_{k,\mathrm{bal}}]$. The second moment bound (5) holds for
$Z_{k,\mathrm{good}}$ so long as $d \le d_{k,\mathrm{cond}} - o_k(1)$.
The notion of "good" is inspired by statistical physics predictions on the geometry of the set of
$k$-colorings. More precisely, according to the cavity method [28, 39], for
$(1 + o_k(1))\,k\ln k < d < d_{k,\mathrm{cond}}$ the set of all $k$-colorings, viewed as a subset of $[k]^n$,
decomposes into tiny "clusters" that are well-separated from each other. To formalize this, we define the
cluster of a balanced $k$-coloring $\sigma$ of G(n, m) as the set

$\mathcal{C}(\sigma) = \{\tau \in [k]^n : \tau \text{ is a balanced } k\text{-coloring and } \rho_{ii}(\sigma,\tau) > 0.51 \text{ for all } i \in [k]\}.$ (11)

In words, $\mathcal{C}(\sigma)$ contains all balanced $k$-colorings $\tau$ in which more than 51% of the
vertices in each color class of $\sigma$ retain their color. The definition of "good" imposes constraints on
the cluster size and separation.
Computing the second moment of $Z_{k,\mathrm{good}}$ boils down to an optimization problem as well.
However, in comparison to (9), this problem is over a significantly reduced domain
$\mathcal{D}_{\mathrm{good}} \subset \mathcal{D}$, reflecting the physics predictions on the clustered geometry
of $k$-colorings:

$\frac{1}{n}\ln \mathrm{E}[Z_{k,\mathrm{good}}^2] \sim \max_{\rho \in \mathcal{D}_{\mathrm{good}}} f(\rho).$ (12)

Thus, instead of relaxing (9) as in [7], our approach is to add constraints to the problem. In particular,
$\rho_{\mathrm{stable}} \notin \mathcal{D}_{\mathrm{good}}$. Furthermore, to solve the maximization problem (12),
we pursue a novel approach: instead of performing a global analysis as in [7], we use an argument based on
local variations, somewhat reminiscent of a gradient method in mathematical programming. Sections 4 and 5
fill in the details.

2.4 The condensation transition 

Finally, why does the second moment method fail beyond $d_{k,\mathrm{cond}}$? According to the (again,
non-rigorous) physics predictions, as d increases up to $d_{k,\mathrm{cond}}$, both the total number
$Z_{k\text{-col}}$ of $k$-colorings and the cluster sizes decrease. However, $Z_{k\text{-col}}$ drops at a faster
rate, and at $d_{k,\mathrm{cond}} + o_k(1)$ the size of the largest cluster $\mathcal{C}(\sigma)$ has the same
order of magnitude as the total number of $k$-colorings w.h.p. In effect, a bounded number of clusters
dominate the entire set of $k$-colorings.

This prediction explains the demise of the second moment method at $d_{k,\mathrm{cond}}$. Indeed, as we saw
above, the second moment method succeeds iff two random colorings $\sigma, \tau$ of G(n, m) "look
uncorrelated" in the sense that their overlap is $\bar\rho$ w.h.p. Once there is condensation, this type of
decorrelation no longer occurs because $\sigma, \tau$ belong to the same cluster (and thus are highly
correlated) with a non-vanishing probability.

But can we prove the existence of a "phase transition" at $d_{k,\mathrm{cond}}$ in any sense? The second
moment argument enables us to trace both the cluster size and the number of $k$-colorings for
$d < d_{k\text{-col}} - o_k(1)$. If one extrapolates these formulas to larger d, one finds that the formula for
the cluster size exceeds the extrapolation of the total number of $k$-colorings by an exponential factor! Of
course, in actuality $Z_{k\text{-col}}$ cannot possibly be less than the size of a single cluster. Thus, under an
appropriate scaling the limiting behavior of $Z_{k\text{-col}}$ and/or the cluster size has to yield at
$d_{k,\mathrm{cond}}$. Indeed, in physics jargon a phase transition is a point $d_0$ where the function

$\varphi(d) = \lim_{n\to\infty} \mathrm{E}\big[Z_{k\text{-col}}(G(n,m))^{1/n}\big]$ (13)

is non-analytic. We believe this to occur at $d_{k,\mathrm{cond}} + o_k(1)$. However, the limit (13) is not
currently known to exist for all d. Therefore, we have to phrase the following result with a bit of care.

Proposition 2.2 There is $\varepsilon_k = o_k(1)$ such that the following is true.

1. The limit $\varphi(d)$ exists and is analytic for all $d < d_{k,\mathrm{cond}} - \varepsilon_k$. Indeed,
$\varphi(d) = k(1 - 1/k)^{d/2}$.

2. By contrast, either $\varphi(d)$ does not exist for some
$d \in (d_{k,\mathrm{cond}} - \varepsilon_k,\, d_{k,\mathrm{cond}} + \varepsilon_k)$ or, if it exists for all
such d, the limiting function $\varphi(d)$ is non-analytic at some point in this interval.

While (13) is not known to exist for all d, Bayati, Gamarnik and Tetali [11] proved the existence of a
closely related limit, the so-called "free energy". Emboldened by their result, we pose

Conjecture 2.3 For any $k \ge 3$ and any $d > 0$ the limit (13) exists.

3 Related work 

Due to the $o_k(1)$ error term in (4), Theorem 1.1 does not yield improved bounds on $d_{k\text{-col}}$ for
small values of k. For instance, the best current bound on the threshold for 3-colorability remains 4.03 [4].
This bound is constructive. It is obtained by tracing a certain linear time algorithm via the differential
equations method. While we have not attempted to optimize the error term in Theorem 1.1, it would be
interesting to see if our techniques render better results for, say, k = 3, 4, 5 as well.

For general k the computational problem of finding a $k$-coloring of G(n,p) in polynomial time is
a long-standing challenge, mentioned prominently in several influential survey articles (e.g., [21, 26]).
Simple greedy algorithms find a $k$-coloring in linear time for
$d < k\ln k \sim \frac{1}{2} d_{k\text{-col}}$ w.h.p. [3, 23, 27], about half the $k$-colorability threshold.
However, no polynomial time algorithm is known to beat the, in the words of Shamir and Spencer [38], "most
vexing" factor of two. In fact, it has been suggested that the appearance of "frozen variables" at
$d \sim \frac{1}{2} d_{k\text{-col}}$ causes the demise of local-search based algorithms [2, 34].

Experiments are inconclusive as to whether the physicists' new message passing algorithms do [13, 33].
Thus, analyzing them rigorously is an important challenge. We believe that a plausible approach to
this problem is to apply the second moment method developed in this paper to intermediate steps of the
algorithm where a number of vertices have already been assigned a color. The additional (substantial)
challenge is that in this setup the symmetry amongst colors is broken due to the previous decisions of the
algorithm.



6 We use the term "analytic" in the sense of complex analysis (i.e., the function admits an expansion into a
power series with a positive radius of absolute convergence). The physics tradition is to actually consider
$\lim_{n\to\infty} \frac{1}{n}\mathrm{E}[\ln Z]$ (cf. [35]). We work with the nth root instead as
$Z = Z_{k\text{-col}}$ may be zero.

7 Proposition 2.2 is proven in Appendix F.



The techniques of Achlioptas and Naor [7] have been used to prove several further important results.
For instance, Achlioptas and Moore identified three (and for some d just two) consecutive integers on
which the chromatic number of the random d-regular graph is concentrated. This was reduced to two integers
for all fixed d (and one for about half of all d) by adding in the small subgraph conditioning technique [25].
We expect that our techniques can be combined with small subgraph conditioning as well to get improved
results for random d-regular graphs.

Both [7] and our Theorem 1.1 deal with the case that the average degree d remains fixed as $n \to \infty$.
In [16] the second moment method from [7] was combined with the concentration argument from [9] to
determine three (and in some cases two) integers on which the chromatic number of G(n, m) is concentrated
for $m \ll n^{5/4}$. We expect that the present techniques allow for an improvement.

Recently Dyer, Frieze and Greenhill [19] generalized the second moment argument from [7] to the
problem of $k$-coloring $j$-uniform random hypergraphs (with average degree d fixed as $n \to \infty$ and
$k, j \ge 3$ fixed as well). As in [7], a key step in their proof is to relax an optimization problem over
doubly-stochastic matrices to the singly-stochastic case. Thus, it would be interesting to see if the present
techniques allow for improved results in the hypergraph case.

Dani, Moore and Olson [18] studied a "decorated" coloring problem in which each pair $(u, v)$ of
vertices comes with a permutation $\pi_{u,v}$ of the k possible colors. These permutations are chosen
independently and uniformly at random for each edge. This leads to a notion of decorated $k$-colorings that
involves the permutations on the edges. They conjecture that the threshold for $k$-colorability in the
"decorated" problem coincides with the common $d_{k\text{-col}}$. It might be interesting to see if our
approach yields better bounds for the decorated $k$-coloring problem, possibly matching its condensation
transition.

The use of the second moment method in random constraint satisfaction problems was pioneered by
Achlioptas and Moore [6] and Frieze and Wormald [22], who dealt with random $k$-SAT. Recently improved
results on binary random constraint satisfaction problems have been obtained via enhanced second moment
arguments [14, 15, 17]. As mentioned earlier, the crucial difference between the previous and the present
work is that we deal with a problem in which each "variable" (i.e., vertex) has more than two "spins"
(colors) to choose from. That said, we harness the idea, first suggested in [17], of combining the second
moment method with physics predictions on the geometry of the solution space. To study these geometric
properties we build upon and extend techniques from [2].

4 The random variable 

The goal in this section is to define the random variable $Z_{k,\mathrm{good}}$ on which our second moment
argument is based and to compute its expectation. At the expense of the $o_k(1)$ error term in (4) we may
assume throughout that $k \ge k_0$ for a big constant $k_0$. We may also assume that n is sufficiently large.

The definition of $Z_{k,\mathrm{good}}$ is guided by the statistical mechanics predictions on the geometry of
the set of $k$-colorings, according to which for densities $(1 + o_k(1))\,k\ln k < d < d_{k,\mathrm{cond}}$ the
$k$-colorings come in well-separated clusters; recall the formal definition (11) of the "cluster"
$\mathcal{C}(\sigma)$.

To formalize the concept of "well-separated", we call a balanced $k$-coloring $\sigma$ separable if for any
other balanced $k$-coloring $\tau$ and any $i, j \in [k]$ such that $\rho_{ij}(\sigma,\tau) > 0.51$ we indeed
have $\rho_{ij}(\sigma,\tau) \ge 1 - \kappa$, where κ = ln' k/k = $o_k(1)$. In other words, the overlap matrix
$\rho(\sigma,\tau)$ does not have entries in the interval $(0.51, 1 - \kappa)$. This definition ensures that
the clusters of two separable colorings $\sigma, \tau$ are either disjoint or identical (to see this, apply the
condition to the diagonal entries $\rho_{ii}(\sigma,\tau)$).

Furthermore, according to the physics calculations each cluster only contains a small fraction of all
balanced $k$-colorings w.h.p. Since their total number does not exceed the expectation
$\mathrm{E}[Z_{k,\mathrm{bal}}]$ much w.h.p. (by Markov's inequality), we definitely expect that each cluster
has size at most $\mathrm{E}[Z_{k,\mathrm{bal}}]$ w.h.p. These considerations lead us to

Definition 4.1 A balanced k-coloring $\sigma$ is good if it is separable and
$|\mathcal{C}(\sigma)| \le \mathrm{E}[Z_{k,\mathrm{bal}}]$.

Let $Z_{k,\mathrm{good}}$ be the number of good $k$-colorings. A key fact is that for
$d < d_{k,\mathrm{cond}}$ the expectation of $Z_{k,\mathrm{good}}$ coincides with the expectation of
$Z_{k\text{-col}}$, the total number of $k$-colorings, up to a sub-exponential factor. Hence, we merely rule
out a (for our purposes) negligible fraction of "bad" colorings.



Proposition 4.2 For $d_{k,\mathrm{AN}} \le d \le d_{k,\mathrm{cond}} - o_k(1)$ we have

$\frac{1}{n}\ln \mathrm{E}[Z_{k,\mathrm{good}}] \sim \frac{1}{n}\ln \mathrm{E}[Z_{k\text{-col}}] \sim \ln k + \frac{d}{2}\ln(1 - 1/k) > 0.$ (14)

The notion of "good" turns out to be sufficient to ensure the success of the second moment method.
More precisely, the core of this work is to establish

Proposition 4.3 There is $C = C(k) > 0$ such that

$\mathrm{E}[Z_{k,\mathrm{good}}^2] \le C \cdot \mathrm{E}[Z_{k,\mathrm{good}}]^2$ for all $d_{k,\mathrm{AN}} \le d \le d_{k,\mathrm{cond}} - o_k(1)$.

Propositions 4.2 and 4.3 together with Lemma 2.1 imply Theorem 1.1. We are going to sketch the
second moment argument in Section 5. But before we come to that, we deal with the first moment.

Proving Proposition 4.2. We compute the first moment by way of the "planted model". Let $A$ be the set
of all pairs $(G, \sigma)$ such that G is a graph on $V = [n]$ with m edges and $\sigma$ is a balanced
$k$-coloring of G. Moreover, let $A_{\mathrm{good}}$ be the set of all $(G, \sigma) \in A$ such that $\sigma$
is a good $k$-coloring of G. Letting $N = \binom{\binom{n}{2}}{m}$ equal the total number of graphs with m
edges, we see that

$\mathrm{E}[Z_{k,\mathrm{bal}}] = |A|/N, \qquad \mathrm{E}[Z_{k,\mathrm{good}}] = |A_{\mathrm{good}}|/N.$

Since we already know the expectation of $Z_{k,\mathrm{bal}}$ (from (7)), we just need to show
$|A_{\mathrm{good}}| \sim |A|$.
The planted distribution provides a simple way to draw a pair $(G, \sigma) \in A$ uniformly at random:

P1. First, draw a balanced map $\sigma : V \to [k]$ uniformly at random.

P2. Then, draw a graph G with m edges that are bichromatic under $\sigma$ uniformly at random.

This experiment induces the uniform distribution on $A$ because each balanced $\sigma$ is a proper
$k$-coloring for an equal number of graphs. (This is not generally true for non-balanced colorings.)
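The two steps P1 and P2 can be sketched in a few lines. The sketch assumes that n is divisible by k and enumerates all bichromatic pairs explicitly, which is only feasible for small n; the function name and interface are illustrative.

```python
import random
from itertools import combinations

def sample_planted(n, k, m, seed=None):
    """Draw (G, sigma) from the planted model on vertices 0..n-1.
    Returns the coloring as a list and the graph as a list of m edges."""
    assert n % k == 0, "this sketch assumes that k divides n"
    rng = random.Random(seed)
    # P1: a uniformly random balanced map (shuffle a balanced color list).
    sigma = [v % k for v in range(n)]
    rng.shuffle(sigma)
    # P2: m distinct edges chosen uniformly among all bichromatic pairs.
    bichromatic = [(u, v) for u, v in combinations(range(n), 2)
                   if sigma[u] != sigma[v]]
    edges = rng.sample(bichromatic, m)
    return sigma, edges

sigma, edges = sample_planted(n=12, k=3, m=10, seed=1)
print(sigma)
print(edges)
```

By construction $\sigma$ is a balanced $k$-coloring of the sampled graph, so properties of typical pairs in $A$ can be explored empirically on small instances.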

Hence, to show that $|A_{\mathrm{good}}| \sim |A|$ it suffices to verify that
$(G, \sigma) \in A_{\mathrm{good}}$ w.h.p. Expansion arguments show that w.h.p. $\sigma$ is separable in G.
Furthermore, with respect to the cluster size we find

Lemma 4.4 Suppose that $d_{k,\mathrm{AN}} \le d \le 2k\ln k$. Then
$\frac{1}{n}\ln|\mathcal{C}(\sigma)| = (1 + o_k(1))\,\frac{\ln 2}{k}$ w.h.p.

The proof of Lemma 4.4 is fairly intricate. It draws on techniques developed in [2]. Roughly speaking,
we establish that w.h.p. G is dominated by a core $G'$ consisting of vertices that each have at least, say,
100 neighbors in $G'$ of each color other than their own. Due to expansion properties, no vertex in $G'$ can
be recolored without leaving the cluster $\mathcal{C}(\sigma)$. Furthermore, w.h.p. most vertices
$v \notin G'$ that have at least one neighbor in each color class other than their own are "attached" to the
core. This means that switching the color of v necessitates recoloring a vertex in $G'$, which is impossible
inside $\mathcal{C}(\sigma)$.

Thus, the volume of the cluster stems from vertices v that fail to have a neighbor of some color
$i \ne \sigma(v)$. Standard calculations show that there are about $n/k$ such v w.h.p., and that for most of
them there is only one "free" color $i \ne \sigma(v)$. Hence, v has two colors to choose from. These choices
turn out to be more or less independent for all v. In effect, the cluster size is $2^{(1+o_k(1))\,n/k}$ w.h.p.,
which is less than $\mathrm{E}[Z_{k,\mathrm{bal}}]$ for $d < d_{k,\mathrm{cond}} - o_k(1)$.
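The comparison between the cluster size and the first moment can be made explicit: on the exponential scale, the cluster size from Lemma 4.4 is $\ln 2 / k$, while the first moment is given by (7). The sketch below ignores all $o_k(1)$ factors, solves for the degree where the two exponents coincide, and compares it with $d_{k,\mathrm{cond}} = 2k\ln k - \ln k - 2\ln 2$. The function names are illustrative.

```python
import math

def first_moment_exponent(k, d):
    """Exponent from (7): ln k + (d/2) ln(1 - 1/k)."""
    return math.log(k) + 0.5 * d * math.log(1 - 1 / k)

def cluster_exponent(k):
    """(1/n) ln |C(sigma)| from Lemma 4.4, without the o_k(1) factor."""
    return math.log(2) / k

def crossing_degree(k):
    """Degree at which the two exponents above are equal."""
    return 2 * (cluster_exponent(k) - math.log(k)) / math.log(1 - 1 / k)

def d_cond(k):
    return 2 * k * math.log(k) - math.log(k) - 2 * math.log(2)

for k in (10, 100, 1000):
    print(k, crossing_degree(k), d_cond(k))
```

The two columns agree up to a term that vanishes as k grows, which is one way to see where the constant $2\ln 2$ in $d_{k,\mathrm{cond}}$ comes from.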

5 The second moment 

As outlined in Section 2, the "vanilla" second moment argument is based on optimizing the function
$f(\rho)$ over the entire set $\mathcal{D}$ of doubly-stochastic matrices. But the notion of "good" colorings
enables us to restrict the domain over which we need to optimize significantly. More precisely, let us call
$\rho \in \mathcal{D}$ separable if for any $i, j \in [k]$ such that $\rho_{ij} > 0.51$ we have
$\rho_{ij} \ge 1 - \kappa$ (with κ = ln k/k). Furthermore, we say that $\rho$ is s-stable if there are
precisely s pairs $(i, j) \in [k] \times [k]$ such that $\rho_{ij} \ge 1 - \kappa$. Clearly, any
doubly-stochastic matrix is s-stable for some $0 \le s \le k$, and in each row (and column) at most one entry
is $\ge 1 - \kappa$. Let

$\mathcal{D}_{\mathrm{good}} = \{\rho \in \mathcal{D} : \rho \text{ is separable and } s\text{-stable for some } 0 \le s \le k - 1\}.$

In other words, $\mathcal{D}_{\mathrm{good}}$ consists of all $\rho \in \mathcal{D}$ with at most $k - 1$
entries that are at least $1 - \kappa$, while all other entries are at most 0.51. In particular,
$\mathcal{D}_{\mathrm{good}}$ does not contain k-stable matrices such as $\rho_{\mathrm{stable}}$ from (10).

Geometrically, the set 2? goo d is obtained from the Birkhoff polytope T> by cutting out "cylinders" 
consisting of matrices with an entry in (0.51, 1 — k). In effect, T> gooc \ is a disconnected set. It decomposes 
into the sets £> Sigc ,od of s-stable p G T> good for < 5 < k — 1. 
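The definitions of "separable" and "$s$-stable" translate directly into a membership test. The following is a minimal sketch, assuming the caller passes a doubly-stochastic matrix as a list of rows and supplies $\kappa$ explicitly; the function names are illustrative and not taken from the paper.

```python
def classify(rho, kappa):
    """Return (separable, s) for a matrix rho given as a list of rows.

    separable: no entry lies strictly between 0.51 and 1 - kappa.
    s:         number of entries that are at least 1 - kappa.
    """
    entries = [x for row in rho for x in row]
    separable = all(x <= 0.51 or x >= 1.0 - kappa for x in entries)
    s = sum(1 for x in entries if x >= 1.0 - kappa)
    return separable, s

def in_d_good(rho, kappa):
    """Membership in D_good: separable and s-stable with s <= k - 1."""
    k = len(rho)
    separable, s = classify(rho, kappa)
    return separable and s <= k - 1
```

For instance, the flat matrix is 0-stable and lies in the good set, whereas the identity matrix is $k$-stable and is therefore excluded.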

These sets $\mathcal D_{s,\mathrm{good}}$ can be interpreted nicely in terms of the faces of the Birkhoff polytope. More precisely, as all $\rho\in\mathcal D_{s,\mathrm{good}}$ are $s$-stable there are precisely $s$ entries $\rho_{ij}$ such that $\rho_{ij}\ge1-\kappa$. By permuting the rows and columns suitably, we may assume that $\rho_{ii}\ge1-\kappa$ for $i=1,\ldots,s$. Thus, $\rho$ is close to the $(k-s)$-dimensional face of the Birkhoff polytope where the first $s$ diagonal entries are equal to one. Furthermore, since all other entries of $\rho$ are $\le0.51$ (because $\rho$ is separable), $\rho$ is in fact close to a point "deep inside" this face. In fact, we are going to show that the maximum of $f(\rho)$ over $\mathcal D_{s,\mathrm{good}}$ is attained at a point very close to the barycenter of the face. The result of this analysis is

Proposition 5.1 For any $0\le s\le k-1$ we have $\max_{\rho\in\mathcal D_{s,\mathrm{good}}}f(\rho)\le f(\bar\rho)$, with equality only for $s=0$.

Before we come to the proof of Proposition 5.1, let us indicate how it implies the second moment bound.

Proof of Proposition 4.3 (assuming Proposition 5.1). Let $Z_s$ be the number of pairs $(\sigma,\tau)$ of good $k$-colorings whose overlap matrix is $s$-stable. Then (by Cauchy-Schwarz)

r k -, 2 k 



^fc^ood 



s=0 



]Tz s <(fc + i)]Tz s 2 . (15) 



s=0 



By construction, the overlap matrix of any two good colorings is separable. Hence, Proposition 5.1 yields

\[ \frac1n\ln\sum_{s=0}^{k-1}E[Z_s]\sim\max_{\rho\in\mathcal D_{\mathrm{good}}}f(\rho)=f(\bar\rho)\sim\frac2n\ln E[Z_{k,\mathrm{good}}]. \tag{16} \]

With a bit of calculus ("Laplace method") we rid (16) of the logarithms to find $C'=C'(k)>0$ such that

\[ \sum_{s=0}^{k-1}E[Z_s]\le C'\cdot E[Z_{k,\mathrm{good}}]^2. \tag{17} \]

Finally, let $\sigma,\tau$ be two good colorings such that $\rho(\sigma,\tau)$ is $k$-stable. A suitable permutation of the color classes of $\tau$ yields a good coloring $\tau'\in\mathcal C(\sigma)$. Since all good $k$-colorings satisfy $|\mathcal C(\sigma)|\le E[Z_{k,\mathrm{good}}]$, we obtain

\[ E[Z_k]\le E\Big[\sum_{\sigma\text{ good}}k!\,|\mathcal C(\sigma)|\Big]\le E\big[k!\cdot E[Z_{k,\mathrm{good}}]\cdot Z_{k,\mathrm{good}}\big]\le k!\cdot E[Z_{k,\mathrm{good}}]^2. \tag{18} \]

Combining (15), (17) and (18), we find that $E[Z_{k,\mathrm{good}}^2]\le C\cdot E[Z_{k,\mathrm{good}}]^2$ for some $C=C(k)>0$. □

Proving Proposition 5.1. The basic idea behind the proof of Proposition 5.1 is to show that the function $f(\rho)$ is maximised on each "splinter" $\mathcal D_{s,\mathrm{good}}$ by a particular matrix $\rho_s$ whose value $f(\rho_s)$ can be estimated easily. Roughly speaking, the idea is to show that for each $\rho\in\mathcal D_{s,\mathrm{good}}$ there is a path from $\rho$ to $\rho_s$ along which the function value increases monotonically. Although this path may leave the Birkhoff polytope temporarily, the target matrices $\rho_s$ are in $\mathcal D_{\mathrm{good}}$.

The matrices $\rho_s$ are "candidate local maxima" of a particularly simple form. The top-left $s\times s$ block of $\rho_s$ is of the form $(1-\alpha)\,\mathrm{id}+\beta\mathbf 1$. The bottom-right $(k-s)\times(k-s)$ square is $\zeta\mathbf 1$. The off-diagonal $s\times(k-s)$ and $(k-s)\times s$ blocks are of the form $\gamma\mathbf 1$. Clearly, $\alpha,\beta,\gamma,\zeta$ must be chosen so that $\rho_s$ is doubly-stochastic, i.e., $1-\alpha+s\beta+(k-s)\gamma=s\gamma+(k-s)\zeta=1$, and, of course, $\alpha,\beta,\gamma,\zeta\ge0$. Furthermore, to ensure that $\rho_s\in\mathcal D_{s,\mathrm{good}}$ we need that $\alpha-\beta\le\kappa$. We let $\rho_s$ be the matrix that maximizes $f$ subject to these constraints. The values $f(\rho_s)$ turn out to be negative for intermediate $\sqrt k<s<k$, and the overall maximum lies at $s=0$. Note that $\rho_0=\bar\rho$.
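The block structure of the candidate matrices can be made concrete. The sketch below builds a matrix of the stated form from free parameters $\beta,\gamma$ and solves the two row-sum constraints for $\alpha$ and $\zeta$; it does not perform the maximization of $f$, and the function name and argument order are illustrative assumptions.

```python
def candidate_matrix(k, s, beta, gamma):
    """Build the block matrix with top-left (1-alpha)*id + beta*1,
    off-diagonal blocks gamma*1 and bottom-right block zeta*1 (requires s < k).

    alpha and zeta are determined by the doubly-stochastic constraints
    1 - alpha + s*beta + (k-s)*gamma = 1 and s*gamma + (k-s)*zeta = 1.
    """
    alpha = s * beta + (k - s) * gamma
    zeta = (1.0 - s * gamma) / (k - s)
    rho = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i < s and j < s:
                rho[i][j] = beta + (1.0 - alpha if i == j else 0.0)
            elif i >= s and j >= s:
                rho[i][j] = zeta
            else:
                rho[i][j] = gamma
    return rho
```

With $\beta=\gamma=0$ this yields the identity on the first $s$ coordinates and the flat matrix $(k-s)^{-1}\mathbf 1$ on the rest; with $s=0$ it yields the flat matrix with all entries $1/k$.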

In fact, the parameters $\beta,\gamma$ in the definition of $\rho_s$ tend to 0 rapidly as $k$ gets larger. In effect, $\rho_s$ is close to the doubly-stochastic matrix $\bar\rho_s$ whose top-left $s\times s$ block is the identity matrix and whose bottom-right $(k-s)\times(k-s)$ block is the flat matrix $(k-s)^{-1}\mathbf 1$. This matrix $\bar\rho_s$ is the barycenter of the $(k-s)$-dimensional face of $\mathcal D$ defined by the equations $\rho_{ii}=1$ for $i=1,\ldots,s$.

We are going to demonstrate the maximization of $f(\rho)$ over $\mathcal D_{s,\mathrm{good}}$ in two cases. First, for $s=0$, where the overall maximum is attained; this turns out to be the simplest case technically.

Proposition 5.2 For any stochastic matrix $\rho$ such that $\max_{i,j}\rho_{ij}\le0.51$ we have $f(\rho)\le f(\bar\rho)$.

In addition, we deal with one somewhat more intricate case.

Proposition 5.3 Suppose that 1 < s < k°". Then for any s-stable p £ 2? g0 od we have f(p) < f(p)- 

Proof of Proposition 5.2. We are going to argue that we can increase the function value by making the rows "flatter", eventually replacing each of them by the vector with all entries equal to $1/k$. Indeed, suppose that row $i$ is not "flat", i.e., there exist $j,l$ such that $\rho_{ij}<\rho_{il}$. A straightforward computation shows that in the extreme case $\rho_{ij}=0$ we have $f(\rho)<f((1-\varepsilon)\rho+\varepsilon\bar\rho)$ for a small enough $\varepsilon>0$. In other words, the maximum of $f$ does not occur on the boundary. Hence, we may assume that $\rho_{ij}>0$. If we increase $\rho_{ij}$ slightly at the expense of $\rho_{il}$, what will happen to the function value? Let $\|\rho\|_2=\big[\sum_{a,b=1}^k\rho_{ab}^2\big]^{1/2}$.

Lemma 5.4 Suppose that $\rho$ is stochastic and that $0<\rho_{ij}=\min_{q\in[k]}\rho_{iq}<\rho_{il}\le0.49$. Then

\[ \frac{\partial f}{\partial\rho_{ij}}-\frac{\partial f}{\partial\rho_{il}}>0. \]



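Proof. A direct computation shows that

\[ \frac{\partial f}{\partial\rho_{ij}}=-\frac{1+\ln\rho_{ij}}{k}+\frac{d\cdot\rho_{ij}}{k^2-2k+\|\rho\|_2^2}. \]

Hence,

\[ \frac{\partial f}{\partial\rho_{ij}}-\frac{\partial f}{\partial\rho_{il}}=\frac1k\ln\left(\frac{\rho_{il}}{\rho_{ij}}\right)+\frac{d\cdot(\rho_{ij}-\rho_{il})}{k^2-2k+\|\rho\|_2^2}. \]

Taking exponentials, we find that

\[ \mathrm{sign}\left(\frac{\partial f}{\partial\rho_{ij}}-\frac{\partial f}{\partial\rho_{il}}\right)=\mathrm{sign}\left(1+\frac{\rho_{il}-\rho_{ij}}{\rho_{ij}}-\exp\left[\frac{d\cdot(\rho_{il}-\rho_{ij})}{k-2+k^{-1}\|\rho\|_2^2}\right]\right) \tag{19} \]

(with the convention that $\mathrm{sign}(z)=\pm1$ if $z$ is positive/negative, and $\mathrm{sign}(0)=0$). Thus, we need to figure out where the linear function $z\mapsto1+z/\rho_{ij}$ intersects the exponential $z\mapsto\exp[z\cdot d/(k-2+\|\rho\|_2^2/k)]$. We claim that

\[ 1+\frac{z}{\rho_{ij}}>\exp\left[\frac{d\cdot z}{k-2+k^{-1}\|\rho\|_2^2}\right]\quad\text{for }0<z\le0.49. \tag{20} \]

Indeed, by convexity, the line and the exponential function intersect in at most one point $z^*>0$, and for $0<z<z^*$ the linear function is greater. Therefore, it suffices to verify that (20) holds at $z=0.49$. On the one hand, because $d\le2k\ln k$, we have

\[ \exp\left[\frac{0.49\,d}{k-2+k^{-1}\|\rho\|_2^2}\right]\le\exp\left[\frac{0.98\,k\ln k}{k-2}\right]\le k^{0.99}, \]

provided that $k$ is not too small. On the other hand, because $\rho_{ij}$ is the smallest entry in row $i$ and $\rho$ is stochastic, we have $\rho_{ij}\le1/k$ and thus $1+z/\rho_{ij}\ge0.49k>k^{0.99}$. □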
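The effect of Lemma 5.4 can be checked numerically. The sketch below assumes that $f(\rho)=\ln k-\frac1k\sum_{i,j}\rho_{ij}\ln\rho_{ij}+\frac d2\ln(1-2/k+k^{-2}\|\rho\|_2^2)$, which is the form consistent with the partial derivative computed in the proof; the helper names are illustrative.

```python
import math

def f(rho, d):
    """Entropy term plus energy term of the (assumed) second-moment exponent."""
    k = len(rho)
    entropy = math.log(k) - sum(x * math.log(x) for row in rho for x in row if x > 0) / k
    frob_sq = sum(x * x for row in rho for x in row)
    energy = 0.5 * d * math.log(1.0 - 2.0 / k + frob_sq / k ** 2)
    return entropy + energy

def derivative_gap(rho, d, i, j, l):
    """df/d rho_ij - df/d rho_il, following the formula in the proof."""
    k = len(rho)
    frob_sq = sum(x * x for row in rho for x in row)
    return (math.log(rho[i][l] / rho[i][j]) / k
            - d * (rho[i][l] - rho[i][j]) / (k * k - 2 * k + frob_sq))

def shift(rho, i, j, l, eps):
    """Move mass eps from entry (i, l) to entry (i, j); row sums are preserved."""
    new = [row[:] for row in rho]
    new[i][j] += eps
    new[i][l] -= eps
    return new
```

On a matrix whose first row has one entry 0.3 and is otherwise flat, the derivative gap is positive, so shifting a little mass from the largest entry of that row to its smallest entry increases $f$.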

Corollary 5.5 Suppose that $\rho$ is stochastic and that $0<\rho_{ij}=\min_{q\in[k]}\rho_{iq}<\rho_{il}=\max_{q\in[k]}\rho_{iq}\le0.49$. Let $\hat\rho$ be the matrix obtained from $\rho$ by replacing the $i$th row by $\frac1k\mathbf 1$. Then $f(\rho)<f(\hat\rho)$.

Proof. Let $Q$ be the set of all stochastic matrices $\rho'$ that coincide with $\rho$ outside row $i$, and that satisfy $\max_{q\in[k]}\rho'_{iq}\le0.49$. Then $Q$ is a compact set and thus $f$ attains a maximum on $Q$. Assume for contradiction that the maximum is attained at $\rho$ itself. Since $\rho_{il}\le0.49$, we clearly have $0<z=\rho_{il}-\rho_{ij}\le0.49$. Hence, Lemma 5.4 and (19) show that increasing $\rho_{ij}$ by a tiny $\varepsilon>0$ and decreasing $\rho_{il}$ by the same $\varepsilon$ yields a stochastic matrix $\rho'\in Q$ with a strictly greater function value. Since this argument applies whenever there are two distinct entries in row $i$, the maximum of $f$ on $Q$ is attained only at the matrix $\hat\rho$ where all entries in row $i$ are equal. □

Geometrically, the proof of Corollary 5.5 can be viewed as showing that there is a path from $\rho$ to $\hat\rho$ along which the function value increases. We use a similar argument to show

Corollary 5.6 Suppose that $\rho$ is stochastic and that $0.49\le\rho_{il}=\max_{q\in[k]}\rho_{iq}\le0.51$. Let $\hat\rho$ be the matrix obtained from $\rho$ by replacing the $i$th row by $\frac1k\mathbf 1$. Then $f(\rho)<f(\hat\rho)$.

Proof. We may assume without loss that $i=1$ and $\rho_{11}\ge\cdots\ge\rho_{1k}>0$. There are two cases to consider, depending on the value of $\rho_{12}$. The first case is that $\rho_{12}<0.49$. Let $\tilde\rho$ be the matrix obtained from $\rho$ by replacing each of $\rho_{12},\ldots,\rho_{1k}$ by $\frac{1-\rho_{11}}{k-1}$. Using (19) as in the proof of Corollary 5.5 we find that $f(\rho)\le f(\tilde\rho)$. Furthermore, direct calculations yield

\[ H(\tilde\rho)-H(\hat\rho)\le\frac{\ln2-0.49\ln k}{k},\qquad E(\tilde\rho)-E(\hat\rho)\le\frac{0.27\ln k}{k}. \]

In particular, $f(\rho)\le f(\tilde\rho)<f(\hat\rho)$. An analogous argument applies in the case $\rho_{12}\ge0.49$. □

Corollaries 5.5 and 5.6 allow us to "flatten" the rows of a matrix $\rho$ as in Proposition 5.2 one by one. Ultimately, this yields the desired bound $f(\rho)\le f(\bar\rho)$, and thus Proposition 5.2.

Proof of Proposition 5.3. Somewhat more delicate arguments are necessary to deal with $\rho\in\mathcal D_{s,\mathrm{good}}$ for $1\le s\le k-1$. By permuting the rows and columns suitably, we may assume that $\rho_{ii}\ge1-\kappa$ for $i=1,\ldots,s$ and that all other entries are less than 0.51. Think of the matrix $\rho$ as consisting of four blocks: the upper-left $s\times s$ matrix, the off-diagonal $s\times(k-s)$ and $(k-s)\times s$ blocks, and the bottom-right $(k-s)\times(k-s)$ matrix. Roughly speaking, to estimate $f(\rho)$ we apply local variations arguments to each of these four blocks, combined with estimates of their contributions to $f$.

We outline the proof of Proposition 15. 3 I to demonstrate this approach. Thus, suppose that p £ 2? s , g0 od 
for some 1 < s < fc°". Assume that pu > 1 — n for 1 < i < s. We are going to compare f(p) with 
/(As)> where jj, s is the doubly-stochastic matrix whose top-left sx s block is the identity matrix and whose 
bottom-right (k — s) x (k — s) is tz~l- A direct calculation yields 

f(M < Hp) - l - ± ^ 1 - (2D 



Local variations arguments akin to those in the proofs of Lemma 5.4 and Corollaries 5.5 and 5.6 yield the following estimate.

Lemma 5.7 Let $\hat\rho$ be the stochastic matrix with entries

\[ \hat\rho_{ij}=\begin{cases}\rho_{ij}&\text{if }i\in[k],\ j\le s,\\ \frac{1}{k-s}\sum_{l>s}\rho_{il}&\text{if }i\in[k],\ j>s.\end{cases} \]

Then $f(\rho)\le f(\hat\rho)$.

We are going to compare $f(\hat\rho)$ with $f(\bar\rho_s)$. Because $\rho$ is doubly stochastic, we have

\[ \kappa s\ge\sum_{i=1}^s\sum_{j>s}\hat\rho_{ij}=\sum_{i=1}^s\sum_{j>s}\rho_{ij}=\sum_{i>s}\sum_{j=1}^s\rho_{ij}=\sum_{i>s}\sum_{j=1}^s\hat\rho_{ij}. \tag{22} \]

Let

\[ q_i=\begin{cases}1-\hat\rho_{ii}&\text{for }i\in[s],\\ \sum_{j=1}^s\hat\rho_{ij}&\text{for }i>s.\end{cases} \]

Because $\hat\rho$ is a stochastic matrix, we can view $H(\hat\rho_i)=-\sum_{j=1}^k\hat\rho_{ij}\ln\hat\rho_{ij}$ as the entropy of the probability distribution $(\hat\rho_{ij})_{j\in[k]}$ on $[k]$. Since the uniform probability distribution maximizes the entropy and as $\hat\rho_{ii}\ge1-\kappa$ for $i\in[s]$, we find

\[ H(\hat\rho_i)\le-(1-q_i)\ln(1-q_i)+q_i\ln(k/q_i)\le-(1-\kappa)\ln(1-\kappa)+\kappa\ln(k/\kappa)\quad\text{for }i\in[s]. \tag{23} \]

Similar estimates show that

\[ H(\hat\rho_i)\le(1-q_i)\ln\frac{1}{1-q_i}+q_i\ln(s/q_i)+(1-q_i)\ln(k-s)\le-(1-q_i)\ln(1-q_i)+q_i\ln(s/q_i)+\ln(k-s)\quad\text{for }i>s. \tag{24} \]

Further, because the function z n- — z ln(z) — (1 — z) ln(l — z) is concave, we obtain from (1221 and (l24T > 

i X! H (^) - ^ir ln(fc ~ s) ~ (1 ~ Ks/k ^ ln(1 ~ Ks/fc) + t ln ^/ K )- (25) 



Combining (23) and (25) and simplifying yields

\[ \ln k-\frac1k\sum_{i,j=1}^k\hat\rho_{ij}\ln\hat\rho_{ij}\le\ln k+\frac{k-s}{k}\ln(k-s)+o(1/k). \tag{26} \]

Because $\hat\rho$ is stochastic we have $\|\hat\rho_i\|_2^2\le1$ for $i=1,\ldots,s$, and by construction $\hat\rho$ satisfies

\[ \sum_{j>s}\hat\rho_{ij}^2=(k-s)\left(\frac{\sum_{l>s}\rho_{il}}{k-s}\right)^2\le\frac{1}{k-s}\quad\text{for }i>s. \]

Further, as $\sum_{i>s}\sum_{j=1}^s\hat\rho_{ij}\le\kappa s$ by (22), we see that $\sum_{i>s}\sum_{j=1}^s\hat\rho_{ij}^2\le(\kappa s)^2$. Combining these bounds, we obtain

\[ \|\hat\rho\|_2^2\le s+1+(\kappa s)^2\le s+1+\ln^{-2}k\quad[\text{by the assumed upper bound on }s]. \tag{27} \]

By estimating the derivative of the function $z\mapsto\frac d2\ln(1-2/k+k^{-2}z^2)$, we obtain from (27)

\[ \frac d2\ln(1-2/k+k^{-2}\|\hat\rho\|_2^2)\le\frac d2\ln(1-2/k+k^{-2}\|\bar\rho_s\|_2^2)+o(1/k). \tag{28} \]

Finally, combining (21), (26) and (28), we obtain $f(\rho)\le f(\hat\rho)\le f(\bar\rho_s)+o(1/k)<f(\bar\rho)$, as desired.

Appendix

Throughout this appendix we assume that $k\ge k_0$ and $n\ge n_0$ are sufficiently big. We assume that

\[ d=2k\ln k-\ln k-c\quad\text{with }c=2\ln2+o_k(1), \]

unless specified otherwise.

In Appendices A to E we give a complete, self-contained proof of Theorem 1.1. Furthermore, Appendix F contains the proof of Proposition 2.2.



A Preliminaries and notation 

This section contains a few facts that we are going to need in due course. We need the following Chernoff bound on the tails of a binomially distributed random variable from [24, p. 21].

Lemma A.1 Let $\varphi(x)=(1+x)\ln(1+x)-x$. Let $X$ be a binomial random variable with mean $\mu>0$. Then for any $t>0$ we have

\[ P[X\ge E[X]+t]\le\exp(-\mu\cdot\varphi(t/\mu)),\qquad P[X\le E[X]-t]\le\exp(-\mu\cdot\varphi(-t/\mu)). \]

In particular, for any $t>1$ we have $P[X>t\mu]\le\exp[-t\mu\ln(t/e)]$.
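As a sanity check, the bounds of Lemma A.1 can be compared with exact binomial tails. This is a minimal sketch using only the standard library; the function names are illustrative, and the lower-tail helper assumes $0<t<\mu$ so that the logarithm is defined.

```python
import math

def phi(x):
    """phi(x) = (1 + x) ln(1 + x) - x, defined for x > -1."""
    return (1.0 + x) * math.log(1.0 + x) - x

def upper_tail_bound(mu, t):
    """Chernoff bound on P[X >= mu + t]."""
    return math.exp(-mu * phi(t / mu))

def lower_tail_bound(mu, t):
    """Chernoff bound on P[X <= mu - t]; assumes 0 < t < mu."""
    return math.exp(-mu * phi(-t / mu))

def binom_pmf(n, p, j):
    return math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)

def exact_upper_tail(n, p, a):
    """P[X >= a] for X ~ Bin(n, p)."""
    return sum(binom_pmf(n, p, j) for j in range(a, n + 1))

def exact_lower_tail(n, p, a):
    """P[X <= a] for X ~ Bin(n, p)."""
    return sum(binom_pmf(n, p, j) for j in range(0, a + 1))
```

For example, with $n=100$, $p=0.3$ and $t=10$ one can compare `exact_upper_tail(100, 0.3, 40)` with `upper_tail_bound(30, 10)`.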

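We are going to need the following version of the chain rule.

Lemma A.2 Suppose that $g:\mathbf R^a\to\mathbf R^b$ and $f:\mathbf R^b\to\mathbf R$ are functions with continuous second derivatives. Then for any $x_0\in\mathbf R^a$ and with $y_0=g(x_0)$ we have for any $i,j\in[a]$

\[ \frac{\partial^2 f\circ g}{\partial x_i\partial x_j}\bigg|_{x_0}=\sum_{k=1}^b\frac{\partial f}{\partial y_k}\bigg|_{y_0}\frac{\partial^2 g_k}{\partial x_i\partial x_j}\bigg|_{x_0}+\sum_{k,l=1}^b\frac{\partial^2 f}{\partial y_k\partial y_l}\bigg|_{y_0}\frac{\partial g_k}{\partial x_i}\bigg|_{x_0}\frac{\partial g_l}{\partial x_j}\bigg|_{x_0}. \]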

We denote the Frobenius norm of a matrix $\rho$ by $\|\rho\|_2$. Furthermore, we let $\rho_i$ denote the $i$th row of $\rho$ and $\rho_{ij}$ the $j$th entry of $\rho_i$. Set $\|\rho\|_\infty=\max_{i,j}|\rho_{ij}|$.

Let

\[ h:[0,1]\to\mathbf R_{\ge0},\qquad x\mapsto-x\ln x-(1-x)\ln(1-x) \]

denote the entropy function. We use the convention that $0\ln0=0$. If $p=(p_1,\ldots,p_k)$ is a vector with non-negative entries such that $\sum_{i=1}^kp_i=1$, we define the entropy of $p$ as

\[ H(p)=-\sum_{i=1}^kp_i\ln(p_i). \]

We are going to need the following basic fact about the entropy (e.g., [35, Chapter 1]).

Proposition A.3 Let $p\in[0,1]^k$ be such that $\sum_{i=1}^kp_i=1$. Then $H(p)\ge0$ and the following two statements hold.

H1. If $p$ has precisely $\nu$ non-zero entries, then $H(p)\le\ln\nu$.

H2. Let $\mathcal I\subset[k]$ and suppose that $q=\sum_{i\in\mathcal I}p_i\in(0,1)$. Let $p^{\mathcal I}$ be the vector with entries $p_i^{\mathcal I}=p_i\cdot\mathbf 1_{i\in\mathcal I}$. Then

\[ H(p)=h(q)+qH(p^{\mathcal I}/q)+(1-q)H((p-p^{\mathcal I})/(1-q)). \]
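The decomposition H2 is easy to verify numerically. The sketch below is illustrative only: it indexes coordinates from 0, represents the index set as a Python set, and assumes $0<q<1$.

```python
import math

def h(x):
    """Binary entropy with the convention 0 ln 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def H(p):
    """Entropy of a probability vector, skipping zero entries."""
    return -sum(x * math.log(x) for x in p if x > 0)

def decomposition(p, index_set):
    """Right-hand side of H2: h(q) + q H(p^I / q) + (1 - q) H((p - p^I) / (1 - q))."""
    q = sum(p[i] for i in index_set)
    inside = [p[i] / q if i in index_set else 0.0 for i in range(len(p))]
    outside = [0.0 if i in index_set else p[i] / (1.0 - q) for i in range(len(p))]
    return h(q) + q * H(inside) + (1.0 - q) * H(outside)
```

Bounding the two conditional entropies by H1 then gives an upper bound of the form $h(q)+q\ln|\mathcal I|+(1-q)\ln(k-|\mathcal I|)$.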



The following is an immediate consequence of Proposition A.3.

Corollary A.4 Let $p\in[0,1]^k$ be such that $\sum_{i=1}^kp_i=1$.

1. Let $\mathcal I\subset[k]$ and set $q=\sum_{i\in\mathcal I}p_i$. Then

\[ H(p)\le h(q)+q\ln(|\mathcal I|)+(1-q)\ln(k-|\mathcal I|). \]

2. Let $\mathcal I\subset\{2,\ldots,k\}$ be a set of size $0<|\mathcal I|<k-1$. Set $q=\sum_{i\in\mathcal I}p_i$. If $p_1<1$, then

\[ H(p)\le h(p_1)+(1-p_1)h(q/(1-p_1))+q\ln(|\mathcal I|)+(1-q-p_1)\ln(k-|\mathcal I|-1). \]

Finally, for $x\in\mathbf R$ we let

\[ \mathrm{sign}(x)=\begin{cases}-1&\text{if }x<0,\\ 0&\text{if }x=0,\\ 1&\text{if }x>0.\end{cases} \]

We use the $O$-notation both for functions of $n$ and of $k$ with the following convention. We write $f(k)=O(g(k))$ to denote the fact that $|f(k)|\le C|g(k)|$ for $k>k_0$, where $k_0,C>0$ are absolute constants (independent of $k,n$). Symbols such as $\Omega(g(k))$, $o(g(k))$ are interpreted analogously. In addition, we write $f(k)=\tilde O(g(k))$ to denote the fact that $|f(k)|\le\ln^C(k)\,|g(k)|$ for all $k>k_0$, where $k_0,C>0$ are absolute constants.

With respect to asymptotics in $n$, we write $\phi(n)=O(\psi(n))$ if $|\phi(n)|\le C\cdot|\psi(n)|$ for $n>n_0$, where $C=C(d,k)>0$, $n_0=n_0(d,k)>0$ are numbers that may depend on $d$ and $k$ but not on $n$. Additionally, we use the symbol $\phi(n)\sim\psi(n)$ to denote the fact that $\lim_{n\to\infty}\phi(n)/\psi(n)=1$.

The symbol $o(1)$ refers to asymptotics in $n$, i.e., $\phi(n)=o(1)$ means that $\lim_{n\to\infty}\phi(n)=0$. By contrast, $o_k(1)$ denotes a term that tends to zero in the limit of large $k$.

B Good colorings 

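A map $\sigma:V\to[k]$ is asymptotically balanced if $\big||\sigma^{-1}(i)|-n/k\big|\le\sqrt n$ for all $i$. Let $\sigma,\tau$ be asymptotically balanced. Their overlap matrix $\rho(\sigma,\tau)$ is the $k\times k$ matrix with entries

\[ \rho_{ij}(\sigma,\tau)=\frac kn\,|\sigma^{-1}(i)\cap\tau^{-1}(j)|. \]

Note that

\[ \sum_{j=1}^k\rho_{ij}(\sigma,\tau)=1+O(n^{-1/2})\quad\text{for all }i\in[k], \]

\[ \sum_{i=1}^k\rho_{ij}(\sigma,\tau)=1+O(n^{-1/2})\quad\text{for all }j\in[k]. \]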
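The overlap matrix is straightforward to compute from two colorings. In this sketch colorings are lists of colors in $\{0,\ldots,k-1\}$ indexed by vertex, which is a representation chosen here for illustration.

```python
def overlap_matrix(sigma, tau, k):
    """rho[i][j] = (k / n) * |{v : sigma[v] == i and tau[v] == j}|."""
    n = len(sigma)
    rho = [[0.0] * k for _ in range(k)]
    for a, b in zip(sigma, tau):
        rho[a][b] += k / n
    return rho
```

If `sigma` is exactly balanced, every row of the result sums to 1; relabeling the colors of `tau` permutes the columns.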

Let $G$ be a graph and let $\sigma$ be an asymptotically balanced $k$-coloring of $G$. We say that $\sigma$ is separable in $G$ if the following is true.

Suppose that $\tau$ is an asymptotically balanced $k$-coloring of $G$. Moreover, suppose that $i,j\in[k]$ are such that $\rho_{ij}(\sigma,\tau)>0.51$. Then $\rho_{ij}(\sigma,\tau)\ge1-\kappa$ for $\kappa=\ln^{20}k/k$.

Furthermore, we call an asymptotically balanced $k$-coloring $\tau$ of $G$ stable with respect to $\sigma$ if $\rho_{ii}(\sigma,\tau)>0.51$ for all $i\in[k]$.

Let $Z_{k\text{-col}}$ be the number of $k$-colorings of $G(n,m)$. We say that $\sigma$ is good in $G$ if the following three conditions hold.

i. $\sigma$ is asymptotically balanced,

ii. $\sigma$ is separable,

iii. Let $\mathcal C(\sigma)$ be the set of all colorings $\tau$ that are stable with respect to $\sigma$. Then $|\mathcal C(\sigma)|\le E[Z_{k\text{-col}}]$.

Furthermore, let $Z_{k,\mathrm{good}}$ be the number of good $k$-colorings of $G(n,m)$.

Proposition B.1 There exist a sequence $\varepsilon_k\to0$ and a number $C(k)>0$ such that for $d=2k\ln k-\ln k-c$ with $c=2\ln2+\varepsilon_k$ we have $E[Z_{k,\mathrm{good}}]=\Theta(E[Z_{k\text{-col}}])=\Theta(k^n(1-1/k)^m)$.

The proof of Proposition B.1 can be found in Section C.

Proposition B.2 There exist a sequence $\varepsilon_k\to0$ and a number $C(k)>0$ such that for $d=2k\ln k-\ln k-c$ with $c=2\ln2+\varepsilon_k$ we have

\[ E[Z_{k,\mathrm{good}}^2]\le C(k)\cdot E[Z_{k,\mathrm{good}}]^2. \]

We are going to prove Proposition B.2 in Section D. The following lemma is an immediate consequence of the sharp threshold result [1].

Lemma B.3 Let $d>0$ and $k\ge3$. If $\liminf_{n\to\infty}P[\chi(G(n,m))\le k]>0$, then $d_{k\text{-col}}\ge d-o(1)$.

Finally, Theorem 1.1 is a direct consequence of Propositions B.1 and B.2 and Lemma B.3.



C The first moment: proof of Proposition B.1



The following lemma regarding the expectation of $Z_{k\text{-col}}$ is folklore; we include its simple proof for the sake of completeness.

Lemma C.1 We have $E[Z_{k\text{-col}}]=\Theta(k^n(1-1/k)^m)$.

Proof. Let $\sigma:V\to[k]$. Let $n_i=|\sigma^{-1}(i)|$ for $i=1,\ldots,k$. Then the probability that a single random edge is monochromatic under $\sigma$ is

\[ \binom n2^{-1}\sum_{i=1}^k\binom{n_i}{2}\ge\frac1k-O(1/n). \]

Therefore,

\[ E[Z_{k\text{-col}}]\le k^n(1-1/k-O(1/n))^m=O(k^n(1-1/k)^m). \]

Conversely, Stirling's formula shows that there are $\Omega(k^n)$ asymptotically balanced maps $\sigma:V\to[k]$. For each of them, we have $|n_i-n/k|\le\sqrt n$ and thus

\[ \sum_{i=1}^k\Big(\frac{n_i}{n}\Big)^2=\frac1k+\frac2k\sum_{i=1}^k\Big(\frac{n_i}{n}-\frac1k\Big)+\sum_{i=1}^k\Big(\frac{n_i}{n}-\frac1k\Big)^2=\frac1k+O(1/n). \]

Consequently, $E[Z_{k\text{-col}}]\ge\Omega(k^n)(1-1/k-O(1/n))^m=\Omega(k^n(1-1/k)^m)$, as desired. □

To proceed, we prove a few statements about the "planted coloring" model. Fix an asymptotically balanced $\sigma:V\to[k]$. Let $V_i=\sigma^{-1}(i)$ and let $q\sim\frac{dk}{(k-1)n}$ be such that the expected number of edges equals $m$ in the random graph $G=G(\sigma)$ in which any two vertices $v,w$ with $\sigma(v)\ne\sigma(w)$ are connected with probability $q$ independently.

Proposition C.2 We have $P[\sigma\text{ is separable in }G(\sigma)]\ge1-O(1/n)$.

To prove Proposition C.2 we need a few preparations.

Lemma C.3 Let $G=G(\sigma)$. Let $i\in[k]$ and let $0.509\le\alpha\le1-k^{-0.499}$. Then with probability $1-\exp(-\Omega(n))$ for all subsets $S\subset V_i$ of size $|S|=\alpha n/k$ the following is true:

the number of vertices $v\in V\setminus V_i$ that do not have a neighbor in $S$ is less than $(1-\alpha)n/k-n^{2/3}$. (29)

Proof. We may assume $i=1$ without loss of generality. We use a first moment argument. For any fixed set $S$ and for any $v\in V\setminus V_1$ the number of neighbors of $v$ in $S$ has distribution $\mathrm{Bin}(\alpha n/k,q)$, with $q\sim\frac{dk}{(k-1)n}$. Hence, the number $X_S$ of $v$ with no neighbor in $S$ has a binomial distribution with mean $n(1-1/k+o(1))(1-q)^{\alpha n/k}$. We have

\[ (1-q)^{\alpha n/k}\le\exp[-\alpha nq/k]\le2k^{-2\alpha} \]

and thus

\[ E[X_S]\le(1+o(1))\,n(1-1/k)\cdot2k^{-2\alpha}. \tag{30} \]

Consequently, by Lemma A.1 (the Chernoff bound)

\[ P\big[X_S\ge(1-\alpha)n/k-n^{2/3}\big]\le\exp\left[-(1-\alpha+o(1))\frac nk\ln\left(\frac{(1-\alpha)n/k}{2en/k^{2\alpha}}\right)\right]\le\exp\left[-(1-\alpha+o(1))\frac nk\ln\left(\frac{1-\alpha}{2e}\cdot k^{2\alpha-1}\right)\right]. \tag{31} \]

By comparison, the number of ways to choose $S$ is

\[ \binom{(1+o(1))n/k}{(1-\alpha+o(1))n/k}\le\left(\frac{e}{1-\alpha}\right)^{(1-\alpha+o(1))n/k}=\exp\left[(1-\alpha+o(1))\frac nk\,(1-\ln(1-\alpha))\right]. \tag{32} \]

Combining (30), (31) and (32), we see that the probability that some set $S$ violates (29) is at most

\[ \exp\left[\frac{(1-\alpha)n}{k}\left(1-\ln(1-\alpha)-\ln\left(\frac{1-\alpha}{2e}\cdot k^{2\alpha-1}\right)\right)+o(n)\right]. \]

To obtain the desired bound, we need to verify that this is $\exp(-\Omega(n))$. That is, we need to estimate

\[ 1-\ln(1-\alpha)-\ln\left(\frac{1-\alpha}{2e}\cdot k^{2\alpha-1}\right)=\ln\left(\frac{2e^2k^{1-2\alpha}}{(1-\alpha)^2}\right). \]

This is negative iff

\[ \exp\left[\Big(\frac12-\alpha\Big)\ln k\right]<\frac{1-\alpha}{\sqrt2\,e}. \tag{33} \]

By convexity, the exponential function on the l.h.s. and the linear function on the r.h.s. intersect at most twice, and between these two intersections the linear function is greater. Further, it is easily verified that the r.h.s. of (33) is larger than the l.h.s. at both $\alpha=0.509$ and $\alpha=1-k^{-0.499}$. Thus, (33) is true in the entire range $0.509\le\alpha\le1-k^{-0.499}$. □



Lemma C.4 With probability $1-\exp(-\Omega(n))$ the random graph $G(\sigma)$ has the following property.

Let $i\in[k]$. No more than $\tilde O(n/k^2)$ vertices $v\notin V_i$ have less than 15 neighbors in $V_i$. (34)

Proof. For each vertex $v\in V\setminus V_i$ the number of neighbors of $v$ in $V_i$ has distribution $\mathrm{Bin}(|V_i|,q)$. The mean of this is $\lambda\sim\frac nkq\ge2\ln k$. Hence, the probability that $v$ has fewer than 15 neighbors in $V_i$ is bounded by $(1+o(1))\lambda^{15}\exp(-\lambda)=\tilde O(1/k^2)$. Furthermore, the event of having fewer than 15 neighbors in $V_i$ occurs independently for all $v\in V\setminus V_i$. Therefore, the number of vertices with this property is binomial as well. Thus, the assertion follows from Lemma A.1 (the Chernoff bound). □

Lemma C.5 With probability $1-O(1/n)$ the random graph $G(\sigma)$ has the following property:

for any set $S\subset V$ of size $|S|\le k^{-4/3}n$ we have $e(S)\le5|S|$. (35)

Proof. We use the union bound. Let $d'=qn\sim\frac{dk}{k-1}$. The probability that there is a set $S$ of size $\alpha n$ violating (35) is bounded by

\[ \binom{n}{\alpha n}\binom{\binom{\alpha n}{2}}{5\alpha n}q^{5\alpha n}\le\left[\frac e\alpha\left(\frac{e\,d'\alpha}{10}\right)^5\right]^{\alpha n}. \]

Summing over all possible values of $\alpha$ completes the proof. □

Proof of Proposition C.2. We need to show that the following holds w.h.p.

Let $\tau$ be an asymptotically balanced $k$-coloring of $G(\sigma)$ and let $i\in[k]$ be such that $\tau(v)=i$ for at least $0.51n/k$ vertices $v\in V_i$. Then $|\{v\in V_i:\tau(v)=i\}|\ge\frac nk(1-\kappa)$.

The proof is based on (29), (34) and (35), all of which hold with probability $1-O(1/n)$ by Lemmas C.3, C.4 and C.5.

Assume, without loss of generality, that $\tau$ is asymptotically balanced and $|\tau^{-1}(1)\cap V_1|\ge0.51n/k$. Then by (29) we may assume that $|\tau^{-1}(1)\cap V_1|\ge\frac nk(1-k^{-0.49})$. Indeed, let $S=\tau^{-1}(1)\cap V_1$ and let $T=\tau^{-1}(1)\setminus V_1$. Then $S\cup T=\tau^{-1}(1)$ is an independent set and $|T|\ge\frac nk-|S|-n^{1/2}$, because $\tau$ is asymptotically balanced; for smaller $S$ this would be in contradiction to (29).

Let $Q$ be the set of all vertices $v\in\tau^{-1}(1)\setminus V_1$ such that $e(v,V_1)\ge15$. Moreover, let $R=V_1\setminus\tau^{-1}(1)$. Then $e(v,R)=e(v,V_1)$ for all $v\in Q$. Further, as $\sigma,\tau$ are asymptotically balanced we have $|R\cup Q|\le n/k^{4/3}$. By the definition of $Q$ and (35), we have

\[ 15|Q|\le e(R\cup Q)\le5|R\cup Q|,\quad\text{whence }|Q|\le|R|/2. \tag{36} \]

Let $W=\tau^{-1}(1)\setminus(Q\cup V_1)$. Since $\sigma,\tau$ are balanced, we have

\[ |\tau^{-1}(1)\cap V_1|+|R|\sim\frac nk\sim|\tau^{-1}(1)\cap V_1|+|Q|+|W|. \]

Hence, by (36)

\[ |R|=|Q|+|W|+o(n)\le\frac{|R|}{2}+|W|+o(n), \]

and thus $|R|\le2|W|+o(n)=\tilde O(n/k^2)$ by (34). □

To estimate $|\mathcal C(\sigma)|$, we need the following concept. Let $\ell=100$ for concreteness.

Definition C.6 Let $\sigma$ be an asymptotically balanced $k$-coloring of $G$.

1. The $\ell$-core of $(G,\sigma)$ is the largest induced subgraph $(V',E')$ of $G$ such that for all $v\in V'$ and all $i\ne\sigma(v)$ we have

\[ |\{w\in N(v)\cap V':\sigma(w)=i\}|\ge\ell. \]

(In words, $v$ has at least $\ell$ neighbors of color $i$ in $V'$ for any color $i\ne\sigma(v)$.)

2. Let $V'$ be the $\ell$-core. A vertex $u\in V$ is $a$-free if

\[ |\{i\in[k]:N(u)\cap V'\cap\sigma^{-1}(i)=\emptyset\}|\ge a+1. \]

3. A vertex that fails to be 1-free is complete.
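Definition C.6 can be turned into a small computation. The sketch below finds the core by the standard peeling procedure (repeatedly deleting vertices that violate the degree condition), which is an assumption about how one would compute the largest such subgraph rather than a procedure stated in the definition; graphs are dictionaries mapping each vertex to its set of neighbors, and colors are integers in $\{0,\ldots,k-1\}$.

```python
def l_core(adj, sigma, k, ell):
    """Vertex set of the ell-core, computed by iterated deletion."""
    core = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(core):
            counts = [0] * k
            for w in adj[v]:
                if w in core:
                    counts[sigma[w]] += 1
            # v must have at least ell core neighbors in every color but its own
            if any(counts[i] < ell for i in range(k) if i != sigma[v]):
                core.remove(v)
                changed = True
    return core

def free_color_count(adj, sigma, k, core, u):
    """Number of colors i such that u has no neighbor of color i in the core."""
    seen = {sigma[w] for w in adj[u] if w in core}
    return k - len(seen)

def is_a_free(adj, sigma, k, core, u, a):
    return free_color_count(adj, sigma, k, core, u) >= a + 1

def is_complete(adj, sigma, k, core, u):
    return not is_a_free(adj, sigma, k, core, u, 1)
```

Since a properly colored vertex never has a neighbor of its own color, its own color is always counted, so a vertex is 1-free exactly when at least one other color is missing among its core neighbors.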

Note that all vertices in the $\ell$-core are complete. In Section C.1 we are going to prove the following.

Proposition C.7 With probability $1-\exp(-\Omega(n))$, $G(\sigma)$ has the following properties.

1. At most $\frac nk(1+O(1/k))$ vertices are 1-free.

2. At most $O(k^{-2})n$ vertices are 2-free.

Lemma C.8 With probability $1-O(1/n)$ the random graph $G=G(\sigma)$ has the following property.

Assume that $\sigma$ is separable. Then for all complete $v$ and all $\tau\in\mathcal C(\sigma)$ we have $\sigma(v)=\tau(v)$. (37)

Proof. The proof is based on an expansion argument. By Lemma C.5 we may assume that $G$ has the property (35). Let $V'$ be the vertex set of the $\ell$-core of $(G,\sigma)$. Let

\[ \Delta_i^+=\{v\in V':\tau(v)=i\ne\sigma(v)\},\qquad\Delta_i^-=\{v\in V':\tau(v)\ne i=\sigma(v)\}. \]

Then

\[ \sum_{i=1}^k|\Delta_i^+|=|\{v\in V':\sigma(v)\ne\tau(v)\}|=\sum_{i=1}^k|\Delta_i^-|. \tag{38} \]

Furthermore, our assumptions that $\sigma$ is separable and that both $\sigma,\tau$ are asymptotically balanced imply that

\[ \max_{i\in[k]}|\Delta_i^+|\le(\kappa+o(1))\frac nk,\qquad\max_{i\in[k]}|\Delta_i^-|\le(\kappa+o(1))\frac nk. \tag{39} \]

Our goal is to show that $\{v\in V':\sigma(v)\ne\tau(v)\}=\emptyset$. This implies by construction that indeed $\sigma(v)=\tau(v)$ for all complete vertices $v$.

Let $S_i=\Delta_i^+\cup\Delta_i^-$ for $i=1,\ldots,k$. Then (39) implies that $|S_i|\le k^{-3/2}n$ for all $i$. Therefore, (35) implies that $e(S_i)\le5|S_i|$ for all $i$. Because $\tau$ is a $k$-coloring, we have $N(v)\cap\sigma^{-1}(i)\subset\Delta_i^-$ for all $v\in\Delta_i^+$. Since $\Delta_i^+$ is contained in the core, this implies that

\[ \ell|\Delta_i^+|\le e(\Delta_i^+,\Delta_i^-)\le e(S_i)\le5|S_i|\le5(|\Delta_i^+|+|\Delta_i^-|). \]

Consequently, we have $|\Delta_i^-|\ge2|\Delta_i^+|$ for all $i$. Thus, (38) yields $\Delta_i^+=\Delta_i^-=\emptyset$ for all $i$. □

Corollary C.9 With probability $1-O(1/n)$ the random graph $G=G(\sigma)$ is such that

\[ |\mathcal C(\sigma)|\le2^{\frac nk(1+\tilde O(1/k))}. \]

Proof. By Lemma C.8 we may assume that (37) holds. Then for each vertex $v$ that is 1-free but not 2-free there is a set $C_v\subset[k]$ of size 2 such that $\tau(v)\in C_v$ for all $\tau\in\mathcal C(\sigma)$. Furthermore, the total number of 2-free vertices is $O(k^{-2})n$. Hence, if we let $F_j$ be the set of $j$-free vertices, then $|\mathcal C(\sigma)|\le2^{|F_1\setminus F_2|}k^{|F_2|}$. Thus, the assertion follows from Proposition C.7. □



Proof of Proposition B.1. Let $\sigma$ be an asymptotically balanced coloring. Consider the following variant of the planted model: let $G'(\sigma)$ be the graph obtained by choosing independently $m$ edges that are bichromatic under $\sigma$. This is equivalent to choosing $G(\sigma)$ conditional on the total number of edges being precisely $m$. Since the probability that $G(\sigma)$ has precisely $m$ edges is $\Omega(n^{-1/2})$, we conclude that

\[ P[G'(\sigma)\in\mathcal A]\le O(\sqrt n)\,P[G(\sigma)\in\mathcal A]\quad\text{for any event }\mathcal A. \]

Letting $\mathcal A$ be the event that either $|\mathcal C(\sigma)|>2^{\frac nk(1+\tilde O(1/k))}$ or that $\sigma$ fails to be separable, we obtain from Proposition C.2 and Corollary C.9 that

\[ P\Big[\sigma\text{ is separable in }G'(\sigma)\text{ and }|\mathcal C(\sigma)|\le2^{\frac nk(1+\tilde O(1/k))}\Big]\ge1-O(n^{-1/2}). \tag{40} \]

Now, let $\Lambda_\sigma$ be the set of all $G$ such that $\sigma$ is a $k$-coloring of $G$. Let $\Lambda=\bigcup_\sigma\Lambda_\sigma\times\{\sigma\}$, with $\sigma$ ranging over asymptotically balanced maps $V\to[k]$. In addition, let $\Lambda_{g,\sigma}$ be the set of all $G$ such that $\sigma$ is a good $k$-coloring of $G$ and let $\Lambda_g=\bigcup_\sigma\Lambda_{g,\sigma}\times\{\sigma\}$. Let $N=\binom{\binom n2}{m}$. Then

\[ E[Z_{k,\mathrm{good}}]=N^{-1}|\Lambda_g|=N^{-1}\sum_\sigma|\Lambda_{g,\sigma}|\sim N^{-1}\sum_\sigma|\Lambda_\sigma|=N^{-1}|\Lambda|=E[Z_{k\text{-col}}]. \]

Thus, the assertion follows from Lemma C.1. □

C.1 Proof of Proposition C.7

Recall the Definition C.6 of an $\ell$-core. Observe that the 1-free and 2-free vertices lie outside the core. Hence, as a step towards a proof of Proposition C.7 we estimate the size of the core. To this end, we follow arguments developed in [2]. As it turns out, the number of non-core vertices is $\tilde O(k^{-1})n$ w.h.p. Unfortunately, this does not quite meet the required bounds in Proposition C.7. Thus, additional arguments are needed to obtain the required bound on the number of 1-free and 2-free vertices.

Proposition C.10 With probability $1-e^{-\Omega(n)}$, the graph $G(\sigma)$ has a 100-core $\mathcal C$ containing all but $\tilde O(k^{-1})n$ vertices.

The proof of Proposition C.10 is constructive. The process of constructing a core is rather similar to the one defined in [2, 8], where vertices of low degree in some color class are being iteratively removed. Expansion properties ensure that this process converges quickly. Set $\ell=100$, and define the following sets.

1 For $i\ne j$, $W_{ij}=\{v\in V_i:e(v,V_j)<3\ell\}$, and $W_{ii}=\emptyset$; $W_i=\bigcup_{j=1}^kW_{ij}$; $W=\bigcup_{i=1}^kW_i$.

2 For $i\ne j$, $U_{ij}=\{v\in V_i\setminus W:e(v,W_j)>\ell\}$ and $U=\bigcup_{i\ne j}U_{ij}$.

3 Set $Z^{(0)}=U$, $i=0$; while there exists $v\in V\setminus Z^{(i)}$ that has more than $\ell$ neighbors in $Z^{(i)}$, define $Z^{(i+1)}=Z^{(i)}\cup\{v\}$; $i=i+1$.

Let $Z$ be the final set $Z^{(i)}$ of the iterative procedure just described. It is now easy to see that $G[V\setminus(W\cup Z)]$ satisfies the core property. To complete the proof of Proposition C.10 we bound the sizes of $W$, $U$ and $Z$ (Lemmas C.11, C.12 and C.13).
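The three steps of this construction can be sketched directly. The code below follows the definitions of $W$, $U$ and $Z$ as reconstructed above, under the assumption that the graph is a dictionary of neighbor sets and colors are integers in $\{0,\ldots,k-1\}$; the function name is illustrative.

```python
def build_core(adj, sigma, k, ell):
    """Return (W, U, Z, remainder) with remainder = V minus (W union Z)."""
    V = set(adj)

    def deg_into(v, S):
        return sum(1 for w in adj[v] if w in S)

    classes = [{v for v in V if sigma[v] == j} for j in range(k)]

    # Step 1: vertices with fewer than 3*ell neighbors in some other color class.
    W = {v for v in V for j in range(k)
         if j != sigma[v] and deg_into(v, classes[j]) < 3 * ell}
    W_by_class = [W & classes[j] for j in range(k)]

    # Step 2: vertices outside W with more than ell neighbors in some W_j.
    U = {v for v in V - W for j in range(k)
         if j != sigma[v] and deg_into(v, W_by_class[j]) > ell}

    # Step 3: grow Z from U by adding vertices with more than ell neighbors in Z.
    Z = set(U)
    changed = True
    while changed:
        changed = False
        for v in V - Z:
            if deg_into(v, Z) > ell:
                Z.add(v)
                changed = True
    return W, U, Z, V - (W | Z)
```

The analysis uses $\ell=100$; the parameter is left free here so that the procedure can be tried on small graphs.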

Lemma C.11 With probability at least $1-e^{-\Omega(n)}$, for every $i$, $|W_i|\le n\cdot\tilde O(k^{-2})$, and $|W|\le n\cdot\tilde O(k^{-1})$.

Proof. Fix $i,j$, $i\ne j$, and let us bound $|W_{ij}|$. Fix a vertex $v\in V_i$, and let $\mu$ be the expected size of $e(v,V_j)$. Since $|e(v,V_j)|$ is binomially distributed, we have $\mu=pn/k=2\ln k+O(\ln k/k)$. Using the Chernoff bound (Lemma A.1), with $t=\mu-3\ell=\mu-300$, and assuming $k$ is sufficiently large, we obtain

\[ P[|e(v,V_j)|<300]\le\exp\{-2\ln k+O(\ln\ln k)\}=\tilde O(k^{-2}). \]

By the linearity of expectation, $E[|W_{ij}|]\le\tilde O(k^{-2})\cdot\frac nk=n\cdot\tilde O(k^{-3})$. Using the Chernoff bound once more, with $t=\Theta(n)$, we obtain that with probability at least $1-e^{-\Omega(n)}$, $|W_{ij}|\le\tilde O(k^{-3})n$. There are $k^2$ indices $i,j$, hence using the union bound, with probability at least $1-k^2e^{-\Omega(n)}=1-e^{-\Omega(n)}$, all $W_{ij}$'s satisfy $|W_{ij}|\le\tilde O(k^{-3})n$. The size of $W$ is then given by $|W|=|\bigcup_{i,j}W_{ij}|\le k^2\cdot n\cdot\tilde O(k^{-3})=n\cdot\tilde O(k^{-1})$, as required. □

Lemma C.12 With probability at least $1-e^{-\Omega(n)}$, the set $U$ satisfies $|U|\le n/k^{30}$.

Proof. We define two sets whose union contains $U_{ij}$:

\[ U_{ij}^{(1)}=\{v\in V:e(v,W_j\setminus W_{ji})>\ell/2\},\qquad U_{ij}^{(2)}=\{v\in V:e(v,W_{ji})>\ell/2\}. \]

Since $U_{ij}\subset U_{ij}^{(1)}\cup U_{ij}^{(2)}$, it suffices to bound the size of each $U_{ij}^{(m)}$, $m=1,2$, separately. The easier case is $U_{ij}^{(1)}$, since the edges $e(v,W_j\setminus W_{ji})$ are independent of the edges that were used to determine $W_j\setminus W_{ji}$ (as they involve other color classes). Thus for every $v\in V_i$, the random variable $e(v,W_j\setminus W_{ji})$ has a binomial distribution $\mathrm{Bin}(|W_j\setminus W_{ji}|,p)$. Lemma C.11 entails that with probability $1-e^{-\Omega(n)}$, $|W_j\setminus W_{ji}|\le n\cdot k^{-2+o(1)}$. Conditioned on that being the size of $W_j\setminus W_{ji}$, $E[|e(v,W_j\setminus W_{ji})|]\le p\cdot n\cdot k^{-2+o(1)}\le\ln k/k$. Using the Chernoff bound in Lemma A.1 with $t=\ell/2-\ln k/k$, we obtain that $P[|e(v,W_j\setminus W_{ji})|>\ell/2]\le k^{-50}$. Using the linearity of expectation, and the Chernoff bound once more, we obtain that with probability at least $1-e^{-\Omega(n)}$, $|U_{ij}^{(1)}|\le n/k^{40}$. Finally,

\[ \Pr\big[|U_{ij}^{(1)}|>n/k^{40}\big]\le\Pr\big[|W_j|>n\cdot k^{-2+o(1)}\big]+\Pr\big[|U_{ij}^{(1)}|>n/k^{40}\ \big|\ |W_j|\le n\cdot k^{-2+o(1)}\big]\le e^{-\Omega(n)}. \]

Now we shall bound the size of $U_{ij}^{(2)}$. In the proof of Lemma C.11 we establish the fact that with probability at least $1-e^{-\Omega(n)}$, for all $s,t$, $|W_{st}|\le n\cdot k^{-3+o(1)}$. Again, we condition on this being the case for $W_{ji}$. The information revealed by exposing that $u\in W_{ji}$ is that the number of neighbors of $u$ in $V_i$ does not exceed $\ell$. Since we are interested in upper bounding the probability that $v\in U_{ij}^{(2)}$, i.e. the probability that $v\in V$ has many neighbors in $W_{ji}$, removing the cap on the degree only works against us. Hence, our random experiment is dominated by upper bounding the probability that $v$ has more than $\ell$ neighbors in a subset of size $nk^{-3+o(1)}$ of $V_j$. The expected number of neighbors is smaller than $\ln k/k^2$, which is smaller than in the already analyzed case of $U_{ij}^{(1)}$. To conclude, we obtain that $P[|U_{ij}^{(2)}|>n/k^{40}]\le e^{-\Omega(n)}$. To conclude the proof of Lemma C.12, with probability at least $1-e^{-\Omega(n)}$,

\[ |U_{ij}^{(1)}|+|U_{ij}^{(2)}|\le2n/k^{40}\le n/k^{30}\ \Rightarrow\ |U_{ij}|\le|U_{ij}^{(1)}|+|U_{ij}^{(2)}|\le n/k^{30}, \]

as claimed. □

Lemma C.13 With probability at least $1 - e^{-\Omega(n)}$ the set $Z$ satisfies $|Z| \le 2n/k^{30}$.

Proof. Lemma C.12 entails that with probability at least $1 - e^{-\Omega(n)}$, $|U| \le n/k^{30}$. Assume that this is indeed the case. Further suppose that the iterative step that defines $Z$ reaches step $i^* = n/k^{30}$. Let us stop the process at this point, and let $Z^* = Z^{(i^*)}$. By the construction of $Z^*$, the graph induced on $U \cup Z^*$ contains at least $i^*\ell$ edges, and also $i^* \ge |U \cup Z^*|/2$. Let us now bound the probability that $G$ contains a set $S$ of at most $2n/k^{30}$ vertices (and at least $n/k^{30}$) which spans $\ell|S|/2 = 50|S|$ edges. This is given by a standard first moment calculation:

\[ \sum_{s=n/k^{30}}^{2n/k^{30}} \binom{n}{s} \binom{\binom{s}{2}}{50s} p^{50s}. \]

The first term accounts for choosing the vertices in $S$, the second term for choosing the edges, and the last term accounts for all edges being present. In upper bounding the sum we used the standard inequality $\binom{n}{t} \le (en/t)^t$. □
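The first moment estimate above can be sanity-checked numerically. The following is a minimal sketch, not the authors' computation: it assumes $p = d/n$ with $d \approx 2k\ln k$ (the edge probability of the model is not restated in this passage, so both are assumptions), applies $\binom{n}{t} \le (en/t)^t$ to both binomial coefficients, and evaluates the logarithm of one summand per vertex of $S$. The function name is hypothetical.

```python
import math

def log_summand_per_vertex(s_frac: float, d: float) -> float:
    """Upper bound on (1/s) * ln[ C(n,s) * C(C(s,2),50s) * p^(50s) ].

    Assumes s = s_frac * n and p = d / n (an assumption made here).
    Uses C(n,s) <= (e*n/s)^s and C(s^2/2, 50s) <= (e*s/100)^(50s),
    so the dependence on n cancels out.
    """
    choose_vertices = 1.0 - math.log(s_frac)                 # ln(e*n/s)
    edges_present = 50.0 * (1.0 + math.log(s_frac * d / 100.0))  # 50*ln(e*s*p/100)
    return choose_vertices + edges_present

def worst_case_exponent(k: int) -> float:
    """Largest per-vertex exponent over the two endpoints of the range of s."""
    d = 2 * k * math.log(k)  # leading-order average degree, illustrative only
    return max(log_summand_per_vertex(c / k**30, d) for c in (1.0, 2.0))
```

A negative value means that each summand is at most $e^{-\Omega(s)}$; since $s \ge n/k^{30}$ and there are at most $n$ summands, the whole sum is $e^{-\Omega(n)}$ under these assumptions.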

Proof of Proposition C.7. In what follows, when we use $i$ and $j$ to index a color class, we assume that $i \ne j$. We define two sets of vertices, which capture the 1-free and 2-free vertices. Let $S_0$ be the set of vertices that have zero neighbors in some color class other than their own. Let $S_1$ be the set of vertices satisfying:

\[ S_1 = \{ v \in V \setminus S_0 : \exists\, i, j \in [k] \text{ s.t. } v \in V_i \text{ and } N(v) \cap V_j \subset W_j \}. \]

A glimpse at the construction of the $\ell$-core shows that any 1-free vertex $v$ has one of the following properties: (P1) $v \in S_0$, (P2) $v \in S_1$ or (P3) $v \in N(Z) \cup Z$ ($Z$ is the set referred to in Lemma C.13). Our goal is then to bound $|S_0|$, $|S_1|$ and $|N(Z)|$. To bound $|S_0|$, fix $i, j \in [k]$ and $v \in V_i$. The probability that $N(v) \cap V_j = \emptyset$ is $(1-p)^{n/k} = k^{-2}(1 + \tilde O(k^{-1}))$. There are $k$ ways to choose $j$, therefore $\mathrm{P}[v \in S_0] = k^{-1}(1 + \tilde O(k^{-1}))$. Using the linearity of expectation and standard concentration results we obtain that $|S_0| = n \cdot k^{-1}(1 + \tilde O(k^{-1}))$ with probability $1 - e^{-\Omega(n)}$. To bound $|S_1|$, define the bad event $B_{v,J}$ for every $v \in V$ and $J \subset [k]$: $B_{v,J}$ = "$v$ has fewer than five neighbors in all color classes $j \in J$". Similar calculations to the line before show that $\mathrm{P}[B_{v,J}] = \tilde O(k^{-2|J|})$. Now,

\[ \mathrm{P}[v \in S_1] = \sum_{J \subset [k]} \mathrm{P}[v \in S_1 \mid B_{v,J}]\, \mathrm{P}[B_{v,J}] \le \mathrm{P}[v \in S_1 \mid B_{v,\emptyset}] + \sum_{J \subset [k]:\, |J| = 1} \mathrm{P}[v \in S_1 \mid B_{v,J}]\, \mathrm{P}[B_{v,J}] + \tilde O(k^{-2}). \]

Consider a vertex $v \in V_i$ that has $h$ neighbors in $V_j$. The probability that all of them fall in $W_j$ is at most $(|W_j| \cdot k/n)^h$. Here we used the fact that the edges of $v$ are chosen independently of each other, and that the probability of an edge connecting $v$ to $u \in W_j$ is negatively correlated with the event $u \in W_j$. We can further condition on the event $\max_j |W_j| = \tilde O(k^{-2}) n$, as this happens with probability $1 - e^{-\Omega(n)}$ (Lemma C.11).

Therefore, for $|J| = 1$, $\mathrm{P}[v \in S_1 \mid B_{v,J}]\, \mathrm{P}[B_{v,J}] \le \tilde O(k^{-1}) \cdot \tilde O(k^{-2}) = \tilde O(k^{-3})$, and $\mathrm{P}[v \in S_1 \mid B_{v,\emptyset}] \le k \cdot \tilde O(k^{-5}) = \tilde O(k^{-4})$. Plugging this back into the equation above, we get $\mathrm{P}[v \in S_1] = \tilde O(k^{-2})$. Using the linearity of expectation and standard concentration results we obtain that $|S_1| = n \cdot \tilde O(k^{-2})$ with probability $1 - e^{-\Omega(n)}$.

To complete the proof we need to upper bound $|N(Z)|$. Lemma C.13 says that $|Z| \le 2n/k^{30}$. Fix a set $A$ of at most this size; then $\mathrm{E}[|N(A)|] \le p n |A| \le n/k^{20}$. Now we have to show that with probability $1 - e^{-\Omega(n)}$ all sets $A$ of this size have no more than, say, $n/k^{10}$ neighbors. Using the Chernoff bound, one can compute the probability of a set $A$ having more than $n/k^{10}$ neighbors, and then use the union bound over all $\binom{n}{2n/k^{30}}$ sets $A$. This way one shows that $|N(Z)| \le n/k^{10}$ with probability $1 - e^{-\Omega(n)}$. Details omitted.

To conclude, the number of 1-free vertices is at most $|S_0| + |S_1| + |Z| + |N(Z)|$, which is at most $n \cdot k^{-1}(1 + \tilde O(k^{-1}))$ with probability $1 - e^{-\Omega(n)}$. Turning to the 2-free vertices, the analysis in this case is somewhat simpler. Let us define $T$ to be the set of vertices that have fewer than five neighbors in at least two color classes. Using the same arguments as in the upper bound on $|S_0|$, we get that $\mathrm{P}[v \in T] = \tilde O(k^{-2})$. Now, any vertex $v$ that is 2-free has one of the following properties: (P1) $v \in T$, (P2) $v \in S_1$ or (P3) $v \in Z \cup N(Z)$. As we saw above, the number of vertices satisfying any of these properties is $\tilde O(k^{-2}) n$ with probability $1 - \exp(-\Omega(n))$, which completes the proof. □



D The second moment 

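For a $k \times k$ matrix $\rho$ with entries in $[0, 1]$ we define

\[ H(\rho) = \ln k - \frac{1}{k} \sum_{i,j=1}^{k} \rho_{ij} \ln \rho_{ij} = \ln k + \frac{1}{k} \sum_{i=1}^{k} \mathcal{H}(\rho_i) \]

(where $\rho_1, \ldots, \rho_k$ are the rows of $\rho$). Furthermore, let

\[ E(\rho) = \frac{d}{2} \cdot \ln\left[ 1 - \frac{2}{k} + \frac{\|\rho\|_2^2}{k^2} \right] \]

and $f(\rho) = H(\rho) + E(\rho)$. Moreover, let $\bar\rho = \frac{1}{k}\mathbf{1}$ be the matrix with all entries equal to $1/k$. Using the expansion $\ln(1 + z) = z - z^2/2 + O(z^3)$, we can approximate $E(\rho)$ by

\[ E(\rho) = \frac{d}{2k^2} \left[ -2k + \|\rho\|_2^2 - 2\left( 1 - \frac{\|\rho\|_2^2}{2k} \right)^2 \right] + o(1/k). \tag{41} \]

We are also going to need an estimate of the differential of $E$ with respect to the Frobenius norm: we have

\[ \frac{\partial}{\partial y}\, \frac{d}{2} \ln(1 - 2/k + y/k^2) = \frac{d}{2k^2 (1 - 2/k + y/k^2)} = \frac{\ln k}{k} \left( 1 + \tilde O(1/k) \right), \tag{42} \]

\[ \frac{\partial^2}{\partial y^2}\, \frac{d}{2} \ln(1 - 2/k + y/k^2) = -\frac{d}{2k^4 (1 - 2/k + y/k^2)^2} = \tilde O(k^{-3}). \tag{43} \]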
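To make the definitions of $H$, $E$ and $f$ concrete, here is a minimal numerical sketch. It takes $\mathcal{H}$ to be the usual entropy with natural logarithms and treats the average degree $d$ as a free parameter; the function names are illustrative and not taken from the paper.

```python
import math

def entropy_term(rho):
    """H(rho) = ln k - (1/k) * sum_ij rho_ij * ln(rho_ij), with 0*ln(0) = 0."""
    k = len(rho)
    total = sum(x * math.log(x) for row in rho for x in row if x > 0)
    return math.log(k) - total / k

def energy_term(rho, d):
    """E(rho) = (d/2) * ln(1 - 2/k + ||rho||_2^2 / k^2)."""
    k = len(rho)
    frob_sq = sum(x * x for row in rho for x in row)
    return 0.5 * d * math.log(1 - 2 / k + frob_sq / k**2)

def f(rho, d):
    """f(rho) = H(rho) + E(rho)."""
    return entropy_term(rho) + energy_term(rho, d)

k, d = 5, 14.0  # illustrative values only
flat = [[1 / k] * k for _ in range(k)]                          # the matrix rho-bar
identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
```

For the flat matrix $\bar\rho$ one has $\|\bar\rho\|_2^2 = 1$, so $f(\bar\rho) = 2\ln k + d\ln(1 - 1/k)$; for the identity matrix (two identical colorings) $f = \ln k + \frac{d}{2}\ln(1 - 1/k)$. Both closed forms can be compared against `f(flat, d)` and `f(identity, d)`.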

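D.1 Overview

Let $\mathcal{B}$ be the set of all asymptotically balanced maps $V \to [k]$. Furthermore, let

\[ \mathcal{R} = \{ \rho(\sigma, \tau) : \sigma, \tau \in \mathcal{B} \} \]

be the set of all possible overlap matrices. In addition, for any $\rho \in \mathcal{R}$ let $\mathcal{B}_\rho$ be the set of all $(\sigma, \tau) \in \mathcal{B}^2$ such that $\rho = \rho(\sigma, \tau)$. Further, let

\[ Z_{\rho,\mathrm{good}} = \left| \{ (\sigma, \tau) \in \mathcal{B}_\rho : \text{both } \sigma, \tau \text{ are good } k\text{-colorings of } G(n, m) \} \right|. \]

Then we can write the second moment as

\[ \mathrm{E}\left[ Z_{k,\mathrm{good}}^2 \right] = \sum_{\rho \in \mathcal{R}} \mathrm{E}\left[ Z_{\rho,\mathrm{good}} \right]. \tag{44} \]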

peiz 

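Proposition D.1 For any $\rho \in \mathcal{R}$ we have $\mathrm{E}[Z_{\rho,\mathrm{good}}] \le \exp[o(n) + n f(\rho)]$, and if $\|\rho - \bar\rho\| \le 1/k^2$ then

\[ \mathrm{E}[Z_{\rho,\mathrm{good}}] \le O(n^{(1-k^2)/2}) \exp[n f(\rho)]. \]

Proof. For any $\rho \in \mathcal{R}$ we have

\[ |\mathcal{B}_\rho| = \binom{n}{(\rho_{ij} n/k)_{i,j \in [k]}}. \]

Hence, by Stirling's formula,

\[ |\mathcal{B}_\rho| \le O\left( \left( \frac{n}{\prod_{(i,j):\, \rho_{ij} > 0} \rho_{ij} n/k} \right)^{1/2} \right) \exp(n H(\rho)) \le \exp[n H(\rho) + o(n)]. \tag{45} \]

Now, consider $(\sigma, \tau) \in \mathcal{B}_\rho$. Let $R_i = \sum_{j=1}^k \rho_{ij}$ and $R'_j = \sum_{i=1}^k \rho_{ij}$. Then for each $i \in [k]$ the number of unordered pairs $\{v, w\}$ of vertices such that $\sigma(v) = \sigma(w) = i$ equals $\binom{R_i n/k}{2}$. Similarly, the number of such pairs $\{v, w\}$ with $\tau(v) = \tau(w) = j$ equals $\binom{R'_j n/k}{2}$. Further, the number of pairs $\{v, w\}$ such that both $\sigma(v) = \sigma(w) = i$ and $\tau(v) = \tau(w) = j$ is $\binom{\rho_{ij} n/k}{2}$. Therefore, by inclusion/exclusion the probability that a random edge is bichromatic under both $\sigma, \tau$ is

\[ q_\rho = 1 - \sum_{i=1}^{k} \frac{\binom{R_i n/k}{2}}{\binom{n}{2}} - \sum_{j=1}^{k} \frac{\binom{R'_j n/k}{2}}{\binom{n}{2}} + \sum_{i,j=1}^{k} \frac{\binom{\rho_{ij} n/k}{2}}{\binom{n}{2}}. \]

Since $\sum_{i=1}^k R_i = \sum_{j=1}^k R'_j = k$, Cauchy-Schwarz implies that $\sum_{i=1}^k R_i^2,\ \sum_{j=1}^k R_j'^2 \ge k$. Hence,

\[ q_\rho \le 1 - \frac{2}{k} + \frac{\|\rho\|_2^2}{k^2} + O(1/n). \]

As a consequence,

\[ \mathrm{P}[\sigma, \tau \text{ are } k\text{-colorings of } G(n, m)] = q_\rho^m \le \left( 1 - \frac{2}{k} + \frac{\|\rho\|_2^2}{k^2} + O(1/n) \right)^m = O(1) \cdot \exp(n E(\rho)). \tag{46} \]

Combining (45) and (46), we see that $\mathrm{E}[Z_{\rho,\mathrm{good}}] \le \exp[o(n) + n f(\rho)]$. Furthermore, if $\|\rho - \bar\rho\| \le 1/k^2$, then the first inequality in (45) and (46) yield $\mathrm{E}[Z_{\rho,\mathrm{good}}] \le O(n^{(1-k^2)/2}) \exp[n f(\rho)]$, as claimed.

□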
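The inclusion/exclusion count behind $q_\rho$ can be verified by brute force on small examples. The sketch below is only an illustration: colorings are plain Python lists (a representation chosen here), and exact rational arithmetic is used so that the two counts can be compared for equality.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

def bichromatic_fraction_bruteforce(sigma, tau):
    """Fraction of vertex pairs that are bichromatic under both colorings."""
    n = len(sigma)
    good = sum(1 for v, w in combinations(range(n), 2)
               if sigma[v] != sigma[w] and tau[v] != tau[w])
    return Fraction(good, comb(n, 2))

def bichromatic_fraction_formula(sigma, tau):
    """Same quantity via inclusion/exclusion over monochromatic pairs."""
    n = len(sigma)
    total = comb(n, 2)
    mono_sigma = sum(comb(c, 2) for c in Counter(sigma).values())
    mono_tau = sum(comb(c, 2) for c in Counter(tau).values())
    mono_both = sum(comb(c, 2) for c in Counter(zip(sigma, tau)).values())
    return Fraction(total - mono_sigma - mono_tau + mono_both, total)

sigma = [0, 0, 1, 1, 2, 2]
tau = [0, 1, 1, 2, 2, 0]
```

Here the class sizes play the role of $R_i n/k$ and $R'_j n/k$, and the joint class sizes play the role of $\rho_{ij} n/k$.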

In Section D.2 we are going to prove the following.

Lemma D.2 There is a number $\eta = \eta(k) > 0$ (independent of $n$, of course) such that for the set

\[ \mathcal{R}_1 = \{ \rho \in \mathcal{R} : \|\rho - \bar\rho\|_\infty < \eta \} \]

we have $\sum_{\rho \in \mathcal{R}_1} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le O(\exp(f(\bar\rho) n))$.

Let $\mathcal{R}_0$ be the set of all $\rho \in \mathcal{R}$ such that there exist $i, j$ with $\rho_{ij} \in [0.51, 1 - \kappa)$. Further, let $\mathcal{R}_s$ be the set of all $\rho \in \mathcal{R}$ such that for each $i$ there is $j$ such that $\rho_{ij} > 0.51$. Let $\mathcal{R}_2 = \mathcal{R} \setminus (\mathcal{R}_0 \cup \mathcal{R}_1 \cup \mathcal{R}_s)$. The core of the second moment argument consists in establishing the following.

Proposition D.3 There is a number $\xi = \xi(k) > 0$ such that for all $\rho \in \mathcal{R}_2$ we have $f(\rho) \le f(\bar\rho) - \xi$.

We defer the proof of Proposition D.3 to Section E.



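Proof of Proposition B.2. Because every good $k$-coloring is separable, we have $Z_{\rho,\mathrm{good}} = 0$ for all $\rho \in \mathcal{R}_0$ with certainty. Indeed, assume that $\sigma, \tau$ are good colorings such that $\rho_{ij}(\sigma, \tau) \in [0.51, 1 - \kappa]$. Then by swapping colors $i$ and $j$ in $\tau$ we obtain a coloring $\tau'$ such that $\rho_{ii}(\sigma, \tau') \in [0.51, 1 - \kappa]$, in contradiction to the assumption that $\sigma$ is good.

Further, the definition of "good" ensures that

\[ \sum_{\rho \in \mathcal{R}_s} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le k! \sum_{\sigma} \mathrm{E}[\mathcal{C}(\sigma) \mid \sigma \text{ is good}] \cdot \mathrm{P}[\sigma \text{ is good}] \le k!\, \mathrm{E}[Z_{k,\mathrm{good}}] \cdot \max_{\sigma} \mathrm{E}[\mathcal{C}(\sigma) \mid \sigma \text{ is good}] \le k!\, \mathrm{E}[Z_{k\text{-col}}]^2. \tag{47} \]

Furthermore, by Propositions D.1 and D.3,

\[ \sum_{\rho \in \mathcal{R}_2} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le \sum_{\rho \in \mathcal{R}_2} \exp[f(\rho) n + o(n)] \le \exp(o(n)) \max_{\rho \in \mathcal{R}_2} \exp[f(\rho) n] = o(\exp[f(\bar\rho) n]). \]

(The second inequality follows because $|\mathcal{R}_2| \le |\mathcal{R}| \le n^{k^2} = \exp(o(n))$.) Hence, (44), (47) and Lemma D.2 imply that

\[ \mathrm{E}[Z_{k,\mathrm{good}}^2] = o(\exp[f(\bar\rho) n]) + \sum_{\rho \in \mathcal{R}_1 \cup \mathcal{R}_s} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le O\left( \exp[f(\bar\rho) n] + \mathrm{E}[Z_{k\text{-col}}]^2 \right). \tag{48} \]

To complete the proof, we need to compare $\exp[f(\bar\rho) n]$ and $\mathrm{E}[Z_{k,\mathrm{good}}]^2$: since $H(\bar\rho) = 2 \ln k$ and

\[ E(\bar\rho) = \frac{d}{2} \ln(1 - 2/k + 1/k^2) = d \ln(1 - 1/k), \]

Proposition B.1 yields $\exp[f(\bar\rho) n] = k^{2n} (1 - 1/k)^{2m} = O(\mathrm{E}[Z_{k\text{-col}}]^2) = O(\mathrm{E}[Z_{k,\mathrm{good}}]^2)$. Thus, the assertion follows from (48). □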

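D.2 Proof of Lemma D.2

Let $\mathcal{R}_1$ be the set of all $\rho \in \mathcal{R}$ such that $\|\rho - \bar\rho\|_\infty < \eta$ for some small $\eta = \eta(k) > 0$ that we will specify in due course. By Proposition D.1 we have

\[ \sum_{\rho \in \mathcal{R}_1} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le O(n^{(1-k^2)/2}) \sum_{\rho \in \mathcal{R}_1} \exp(n f(\rho)). \tag{49} \]

By construction, we have $\sum_{i,j=1}^k \rho_{ij} = k$ for all $\rho \in \mathcal{R}_1$. Therefore, we can parametrize the set $\mathcal{R}_1$ as follows. Let

\[ \mathcal{L} : [0,1]^{k^2-1} \to [0,1]^{k^2}, \qquad \hat\rho = (\hat\rho_{ij})_{(i,j) \in [k]^2 \setminus \{(k,k)\}} \mapsto \mathcal{L}(\hat\rho) = (\mathcal{L}_{ij}(\hat\rho))_{i,j \in [k]}, \]

where $\mathcal{L}_{ij}(\hat\rho) = \hat\rho_{ij}$ for $(i,j) \ne (k,k)$ and $\mathcal{L}_{kk}(\hat\rho) = k - \sum_{(i,j) \ne (k,k)} \hat\rho_{ij}$. Let $\hat{\mathcal{R}}_1 = \mathcal{L}^{-1}(\mathcal{R}_1)$. Then $\mathcal{L}$ induces a bijection $\hat{\mathcal{R}}_1 \to \mathcal{R}_1$. Thus,

\[ \sum_{\rho \in \mathcal{R}_1} \exp(n f(\rho)) = \sum_{\hat\rho \in \hat{\mathcal{R}}_1} \exp(n \cdot f \circ \mathcal{L}(\hat\rho)). \tag{50} \]

To study the function $f \circ \mathcal{L} = H \circ \mathcal{L} + E \circ \mathcal{L}$ for $\hat\rho \in \hat{\mathcal{R}}_1$, we compute its first two differentials. A direct calculation yields for $(i,j) \ne (k,k)$ and $(a,b) \notin \{(i,j), (k,k)\}$

\[ \frac{\partial}{\partial \hat\rho_{ij}} H \circ \mathcal{L}(\hat\rho) = -\frac{1}{k} \ln \frac{\mathcal{L}_{ij}(\hat\rho)}{\mathcal{L}_{kk}(\hat\rho)}, \qquad \frac{\partial^2}{\partial \hat\rho_{ij}^2} H \circ \mathcal{L}(\hat\rho) = -\frac{1}{k} \left( \frac{1}{\mathcal{L}_{ij}(\hat\rho)} + \frac{1}{\mathcal{L}_{kk}(\hat\rho)} \right), \qquad \frac{\partial^2}{\partial \hat\rho_{ij} \partial \hat\rho_{ab}} H \circ \mathcal{L}(\hat\rho) = -\frac{1}{k \mathcal{L}_{kk}(\hat\rho)}. \]

Furthermore, letting $\mathcal{F}(\rho) = \|\rho\|_2^2$, we find

\[ \frac{\partial}{\partial \hat\rho_{ij}} \mathcal{F} \circ \mathcal{L}(\hat\rho) = 2(\mathcal{L}_{ij}(\hat\rho) - \mathcal{L}_{kk}(\hat\rho)), \qquad \frac{\partial^2}{\partial \hat\rho_{ij}^2} \mathcal{F} \circ \mathcal{L}(\hat\rho) = 4, \qquad \frac{\partial^2}{\partial \hat\rho_{ij} \partial \hat\rho_{ab}} \mathcal{F} \circ \mathcal{L}(\hat\rho) = 2. \]

Thus, by Lemma A.2 (the chain rule), (42) and (43),

\[ \frac{\partial}{\partial \hat\rho_{ij}} E \circ \mathcal{L}(\hat\rho) = \frac{(\mathcal{L}_{ij}(\hat\rho) - \mathcal{L}_{kk}(\hat\rho)) \cdot d}{k^2 (1 - 2/k + \mathcal{F} \circ \mathcal{L}(\hat\rho)/k^2)}, \]

\[ \frac{\partial^2}{\partial \hat\rho_{ij}^2} E \circ \mathcal{L}(\hat\rho) = \frac{2d}{k^2 (1 - 2/k + \mathcal{F} \circ \mathcal{L}(\hat\rho)/k^2)} - \frac{2d (\mathcal{L}_{ij}(\hat\rho) - \mathcal{L}_{kk}(\hat\rho))^2}{k^4 (1 - 2/k + \mathcal{F} \circ \mathcal{L}(\hat\rho)/k^2)^2} = \tilde O(1/k), \]

\[ \frac{\partial^2}{\partial \hat\rho_{ij} \partial \hat\rho_{ab}} E \circ \mathcal{L}(\hat\rho) = \frac{d}{k^2 (1 - 2/k + \mathcal{F} \circ \mathcal{L}(\hat\rho)/k^2)} - \frac{2d (\mathcal{L}_{ij}(\hat\rho) - \mathcal{L}_{kk}(\hat\rho)) (\mathcal{L}_{ab}(\hat\rho) - \mathcal{L}_{kk}(\hat\rho))}{k^4 (1 - 2/k + \mathcal{F} \circ \mathcal{L}(\hat\rho)/k^2)^2} = \tilde O(1/k). \]

In particular, we see that $D f \circ \mathcal{L}(k^{-1} \mathbf{1}) = 0$. Furthermore, for $\eta$ small enough the second partial differentials of $E \circ \mathcal{L}$ are all positive for all $\hat\rho \in \hat{\mathcal{R}}_1$. Hence, the Hessian of $E$ is positive definite, while that of $H$ is negative definite. Their sum is negative definite due to the above $\tilde O(1/k)$ term. More precisely, we find that there exist constants $\xi > 0$, $\eta > 0$ such that for all $\hat\rho \in \hat{\mathcal{R}}_1$ we have

\[ f \circ \mathcal{L}(\hat\rho) \le f(\bar\rho) - \xi \sum_{(i,j) \ne (k,k)} (\hat\rho_{ij} - 1/k)^2. \tag{51} \]

Thus, we obtain

\[ \sum_{\rho \in \mathcal{R}_1} \mathrm{E}[Z_{\rho,\mathrm{good}}] \le \exp(f(\bar\rho) n) \cdot O(n^{(1-k^2)/2}) \sum_{\hat\rho \in \hat{\mathcal{R}}_1} \exp\left[ -n \xi \sum_{(i,j) \ne (k,k)} (\hat\rho_{ij} - 1/k)^2 \right] \]

\[ \le \exp(f(\bar\rho) n) \cdot O(1) \int \exp\left[ -\xi \sum_{(i,j) \ne (k,k)} (z_{ij} - 1/k)^2 \right] dz \le \exp(f(\bar\rho) n) \cdot O(1) \left( \int_{-\infty}^{\infty} \exp[-\xi z^2]\, dz \right)^{k^2 - 1} = O(1) \cdot \exp(f(\bar\rho) n), \]

as desired.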
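The last chain of inequalities rests on the fact that a Gaussian sum over a lattice with spacing $k/n$ grows like $n^{1/2}$ per coordinate, which is exactly what the prefactor $n^{(1-k^2)/2}$ compensates over the $k^2 - 1$ free coordinates. A one-dimensional sketch of this scaling (with an arbitrary illustrative value of $\xi$; the function name is hypothetical):

```python
import math

def rescaled_lattice_sum(n: int, k: int, xi: float) -> float:
    """n^(-1/2) * sum over lattice points x = j*k/n of exp(-xi * n * x^2)."""
    total = sum(math.exp(-xi * n * (j * k / n) ** 2) for j in range(-n, n + 1))
    return total / math.sqrt(n)

def gaussian_limit(k: int, xi: float) -> float:
    """Limit predicted by the integral: (1/k) * int exp(-xi z^2) dz = sqrt(pi/xi)/k."""
    return math.sqrt(math.pi / xi) / k
```

For instance, `rescaled_lattice_sum(10_000, 3, 1.0)` and `rescaled_lattice_sum(40_000, 3, 1.0)` both agree with `gaussian_limit(3, 1.0)` to high precision, so the rescaled sum stays $O(1)$ as $n$ grows.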



E Proof of Proposition D.3

E.1 Overview

Throughout this section we let $\mathcal{S}$ be the set of all stochastic $k \times k$ matrices. Let $\kappa = \tilde O(1/k)$ be as in the definition of "separable" above. We call a $k \times k$ matrix $\rho$ separable if $\rho_{ij} \notin (0.51, 1 - \kappa)$ for all $i, j \in [k]$. Further, we call $\rho$ stable if for each $i$ there is $j$ such that $\rho_{ij} > 0.51$. Let $R$ be the set of all separable doubly stochastic matrices that are not stable.

Lemma E.1 Let $\eta > 0$. Suppose that there is a number $\gamma > 0$ such that for all $\rho \in R$ such that $\|\rho - \bar\rho\|_\infty > \eta$ we have $f(\rho) < f(\bar\rho) - \gamma$. Then for all $\rho' \in \mathcal{R}_2$ we have $f(\rho') < f(\bar\rho) - \gamma/2$.

Proof. For any $\rho \in \mathcal{R}_2$ there is $\rho' \in R$ such that $\|\rho - \rho'\|_\infty = O(n^{-1/2})$. To see this, recall that $\sum_{i,j=1}^k \rho_{ij} = k$ by construction. Hence, while there is a row $i$ such that $\sum_{j=1}^k \rho_{ij} = 1 + \alpha > 1$, there must be another row $l$ such that $\sum_{j=1}^k \rho_{lj} = 1 - \alpha' < 1$. Thus, by replacing row $i$ by $(1 - \alpha'') \rho_i$ and row $l$ by $\rho_l + \alpha'' \rho_i$ for some suitable $\alpha'' > 0$, we can ensure that at least one of the row sums is one. After at most $k - 1$ steps, we thus obtain a stochastic matrix $\rho''$ such that $\|\rho - \rho''\|_\infty = O(n^{-1/2})$. Repeating the same operation for the columns yields the desired doubly stochastic $\rho'$. (Note that the column operations do not affect the row sums.) The assertion follows from the continuity of $f$. □
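The row-balancing step in this proof is easy to make explicit. The sketch below is an illustration only: it works with exact fractions, picks $\alpha''$ as the largest value that does not overshoot either row (a choice made here; the proof only asks for "some suitable" $\alpha''$), and checks that moving a multiple of row $i$ into row $l$ leaves every column sum unchanged.

```python
from fractions import Fraction as F

def balance_rows(rho):
    """Return a copy of rho (entries >= 0, total mass k) whose rows all sum to 1."""
    rho = [row[:] for row in rho]
    k = len(rho)
    assert sum(map(sum, rho)) == k
    for _ in range(k - 1):                      # at most k - 1 steps are needed
        sums = [sum(r) for r in rho]
        heavy = next((i for i in range(k) if sums[i] > 1), None)
        if heavy is None:
            break
        light = next(l for l in range(k) if sums[l] < 1)
        alpha = sums[heavy] - 1                 # surplus of the heavy row
        alpha_prime = 1 - sums[light]           # deficit of the light row
        a2 = min(alpha, alpha_prime) / sums[heavy]
        moved = [a2 * x for x in rho[heavy]]    # alpha'' * rho_i
        rho[heavy] = [x - m for x, m in zip(rho[heavy], moved)]
        rho[light] = [x + m for x, m in zip(rho[light], moved)]
    return rho

example = [[F(3, 4), F(1, 2), F(1, 4)],
           [F(0), F(1, 4), F(1, 2)],
           [F(1, 4), F(1, 4), F(1, 4)]]
balanced = balance_rows(example)
```

Each step fixes at least one row sum at exactly one, which is why $k - 1$ iterations suffice; applying the same routine to the transpose handles the columns.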

Thus, our goal is to prove that $\bar\rho$ is the unique global maximum of $f$ on $R$. The proof of this proceeds in several steps. In Section E.2 we will prove the following.

Proposition E.2 For all $\rho \in \mathcal{S}$ such that $\rho_{ij} \in [0.15, 0.51]$ for some pair $(i,j) \in [k]^2$ we have $f(\rho) < 0$.

We say that $\rho \in R$ is $s$-stable if $\rho_{ij} \notin (0.15, 1 - \kappa)$ for all $i, j \in [k]$ and if there are precisely $s$ index pairs $(i,j) \in [k]^2$ such that $\rho_{ij} > 1 - \kappa$. In Section E.3 we are going to show the following.

Proposition E.3 Let $1 \le s \le k^{0.999}$. Then for all $s$-stable $\rho \in R$ we have $f(\rho) \le f(\bar\rho) - 1/k + o(1/k)$.

Furthermore, in Section E.4 we will prove

Proposition E.4 Suppose that $k^{0.999} \le s \le k - k^{0.49}$. Then for all $s$-stable $\rho \in R$ we have $f(\rho) < 0$.

In addition, Section E.5 contains the proof of

Proposition E.5 Suppose that $k - k^{0.49} < s \le k - 1$. Then for all $s$-stable $\rho \in R$ we have $f(\rho) < 0$.

The proofs of Propositions E.2 to E.5 are based on a local variations argument. Roughly speaking, we are going to identify a small subset $R_{\max}$ of $R$ such that for all $\rho \in R \setminus R_{\max}$ a higher function value can be attained by tweaking $\rho$ slightly in the direction of some point in $R_{\max}$. For the points in $R_{\max}$ we can estimate the function values explicitly, and the value of $\bar\rho$ will emerge to be the largest. We start with the following simple fact about the partial differentials of $f$.



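Lemma E.6 Let $\rho \in \mathcal{S}$. Let $i, j, l \in [k]$ and set $\delta = \rho_{il} - \rho_{ij}$. Suppose that $\rho_{ij}, \rho_{il} > 0$. Then

\[ \operatorname{sign}\left( \frac{\partial f}{\partial \rho_{ij}} - \frac{\partial f}{\partial \rho_{il}} \right) = \operatorname{sign}\left( 1 + \frac{\delta}{\rho_{ij}} - \exp\left[ \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right] \right). \]

Proof. A direct calculation yields

\[ \frac{\partial f}{\partial \rho_{ij}} - \frac{\partial f}{\partial \rho_{il}} = \frac{1}{k} \ln \frac{\rho_{il}}{\rho_{ij}} - \frac{d (\rho_{il} - \rho_{ij})}{k^2 (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)}. \]

Substituting $\delta = \rho_{il} - \rho_{ij}$, we find

\[ \frac{\partial f}{\partial \rho_{ij}} - \frac{\partial f}{\partial \rho_{il}} = \frac{1}{k} \left[ \ln(1 + \delta/\rho_{ij}) - \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right]. \]

Taking exponentials yields the assertion. □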
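The identity behind Lemma E.6 can be checked numerically with central finite differences. The sketch below treats the entries of $\rho$ as free variables (as the partial derivatives in the lemma do), uses the definitions $H(\rho) = \ln k - \frac1k\sum \rho_{ij}\ln\rho_{ij}$ and $E(\rho) = \frac d2 \ln(1 - 2/k + \|\rho\|_2^2/k^2)$, and picks an arbitrary small matrix and value of $d$ purely for illustration.

```python
import math

def f(rho, d):
    """f = H + E for a k x k matrix with positive entries."""
    k = len(rho)
    ent = math.log(k) - sum(x * math.log(x) for row in rho for x in row) / k
    frob_sq = sum(x * x for row in rho for x in row)
    return ent + 0.5 * d * math.log(1 - 2 / k + frob_sq / k**2)

def partial(rho, d, a, b, h=1e-6):
    """Central finite-difference approximation of df/d rho_ab."""
    up = [row[:] for row in rho]
    dn = [row[:] for row in rho]
    up[a][b] += h
    dn[a][b] -= h
    return (f(up, d) - f(dn, d)) / (2 * h)

def gradient_gap_closed_form(rho, d, i, j, l):
    """(1/k) * [ln(1 + delta/rho_ij) - d*delta/(k*Q)] with delta = rho_il - rho_ij."""
    k = len(rho)
    q = 1 - 2 / k + sum(x * x for row in rho for x in row) / k**2
    delta = rho[i][l] - rho[i][j]
    return (math.log(1 + delta / rho[i][j]) - d * delta / (k * q)) / k

rho = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.1, 0.7, 0.2]]
d = 4.0
numeric = partial(rho, d, 0, 0) - partial(rho, d, 0, 1)
closed = gradient_gap_closed_form(rho, d, 0, 0, 1)
```

Since $\ln A - B$ and $A - e^{B}$ always have the same sign, the sign of `closed` coincides with the sign expression in the lemma.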

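Lemma E.7 Let $\rho$ be a stochastic $k \times k$ matrix. Let $i, j \in [k]$ and assume that $\rho_{ij} > 0$.

1. If

\[ \frac{1}{\rho_{ij}} > \frac{d}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)}, \tag{52} \]

then there exists a unique $\delta^* > 0$ such that

\[ 1 + \frac{\delta^*}{\rho_{ij}} = \exp\left[ \frac{d \delta^*}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right]. \]

Furthermore, for all $0 < \delta < \delta^*$ we have

\[ 1 + \frac{\delta}{\rho_{ij}} > \exp\left[ \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right]. \]

2. If (52) does not hold, then for all $\delta > 0$ we have

\[ 1 + \frac{\delta}{\rho_{ij}} < \exp\left[ \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right]. \]

Proof. There is at most one $\delta^* > 0$ where the straight line $\delta \mapsto 1 + \delta/\rho_{ij}$ intersects the strictly convex function

\[ \delta \mapsto \exp\left[ \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right]. \]

In fact, there is exactly one such $\delta^*$ iff the differential of the linear function is greater than that of the exponential function at $\delta = 0$. Since

\[ \frac{\partial}{\partial \delta} \left( 1 + \frac{\delta}{\rho_{ij}} \right) \bigg|_{\delta = 0} = \frac{1}{\rho_{ij}}, \qquad \frac{\partial}{\partial \delta} \exp\left[ \frac{d \delta}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)} \right] \bigg|_{\delta = 0} = \frac{d}{k (1 - \frac{2}{k} + \frac{1}{k^2} \|\rho\|_2^2)}, \]

the assertion follows. □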
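The crossing point $\delta^*$ of Lemma E.7 has no closed form, but it is easy to locate numerically. The following sketch writes $a = 1/\rho_{ij}$ for the slope of the line and $b = d/(k(1 - 2/k + \|\rho\|_2^2/k^2))$ for the rate of the exponential (shorthand introduced here, not in the paper) and uses bisection, assuming $a > b > 0$ as in condition (52).

```python
import math

def crossing_point(a: float, b: float, tol: float = 1e-12) -> float:
    """Positive solution of 1 + a*x = exp(b*x), assuming a > b > 0."""
    if not a > b > 0:
        raise ValueError("need a > b > 0, i.e. condition (52)")
    g = lambda x: 1 + a * x - math.exp(b * x)   # concave, g(0) = 0, g'(0) > 0
    lo = math.log(a / b) / b                    # maximiser of g, so g(lo) > 0
    hi = 2 * lo
    while g(hi) > 0:                            # expand until the line is below
        hi *= 2
    while hi - lo > tol:                        # plain bisection on [lo, hi]
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

delta_star = crossing_point(2.0, 1.0)
```

With $a = 2$, $b = 1$ the root is roughly $1.256$; below it the line dominates the exponential and above it the exponential dominates, which is the dichotomy used in the proofs that follow.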

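Corollary E.8 Suppose that $\rho \in \mathcal{S}$. Let $i \in [k]$ and $J \subset [k]$ be such that for some $\lambda > 10^{-4}$ we have

\[ |J| \ge k^{\lambda} \quad\text{and}\quad \max_{j \in J} \rho_{ij} < \lambda/2 - 10/\ln k. \]

Let $\hat\rho$ be the matrix with entries $\hat\rho_{ab} = \rho_{ab}$ for all $(a, b) \notin \{i\} \times J$ and

\[ \hat\rho_{ia} = |J|^{-1} \sum_{j \in J} \rho_{ij} \quad\text{for } a \in J. \]

Then $f(\rho) \le f(\hat\rho)$. In fact, if $\rho \ne \hat\rho$, then $f(\rho) < f(\hat\rho)$.

Proof. If $\rho_{ij} = 0$ for all $j \in J$, then $\rho = \hat\rho$ and there is nothing to show. Thus, assume that $\sum_{j \in J} \rho_{ij} > 0$. Suppose that $\tilde\rho \in \mathcal{S}$ maximizes $f(\tilde\rho)$ subject to the conditions

i. $\tilde\rho_{ab} = \rho_{ab}$ for all $(a, b) \notin \{i\} \times J$ and

ii. $\max_{j \in J} \tilde\rho_{ij} \le \max_{j \in J} \rho_{ij}$.

Such a maximizer $\tilde\rho$ exists because i. and ii. define a compact domain.

We claim that $\tilde\rho_{ij} > 0$ for all $j \in J$. Indeed, assume that $\tilde\rho_{ij} = 0$ for some $j \in J$ but $\tilde\rho_{il} > 0$ for some other $l \in J$. Then there is $\xi > 0$ such that the matrix $\rho'$ obtained from $\tilde\rho$ by replacing $\tilde\rho_{ij}$ by $\xi$ and $\tilde\rho_{il}$ by $\tilde\rho_{il} - \xi$ satisfies $f(\rho') > f(\tilde\rho)$. This is because the derivative of the function $z \mapsto -z \ln z$ tends to infinity as $z \to 0$. Hence, we obtain a contradiction to the maximality of $\tilde\rho$.

Thus, let $a$ be such that $\tilde\rho_{ia} = \min_{j \in J} \tilde\rho_{ij} > 0$. Because $\tilde\rho$ is stochastic, we have $\|\tilde\rho\|_2^2 \in [1, k]$ and $|J| \tilde\rho_{ia} \le \sum_{j \in J} \tilde\rho_{ij} \le 1$. Therefore,

\[ 1/\tilde\rho_{ia} \ge |J| \ge k^{\lambda} > 3 \ln k > \frac{d}{k (1 - 2/k + \|\tilde\rho\|_2^2/k^2)}. \tag{53} \]

Thus, (52) is satisfied. Our assumptions $\hat\delta = \lambda/2 - 10/\ln k$ and $\lambda > 10^{-4}$ ensure that

\[ \exp\left( \frac{d \hat\delta}{k (1 - 2/k + \|\tilde\rho\|_2^2/k^2)} \right) \le \exp\left( \frac{2 \hat\delta \ln k}{(1 - 1/k)^2} \right) \le \exp(-10) k^{\lambda} \le \exp(-10) |J| < 1 + \hat\delta/\tilde\rho_{ia}. \tag{54} \]

Let $b \in J$ be such that $\tilde\rho_{ib} = \max_{j \in J} \tilde\rho_{ij}$. Assume that $\delta = \tilde\rho_{ib} - \tilde\rho_{ia} > 0$. Since $\delta \le \tilde\rho_{ib} \le \hat\delta$, (53) and (54) yield in combination with Lemmas E.6 and E.7 that

\[ \frac{\partial f}{\partial \tilde\rho_{ia}} - \frac{\partial f}{\partial \tilde\rho_{ib}} > 0, \]

in contradiction to our assumption that $\tilde\rho$ maximizes $f$ subject to i. and ii. Hence, $\min_{j \in J} \tilde\rho_{ij} = \tilde\rho_{ia} = \tilde\rho_{ib} = \max_{j \in J} \tilde\rho_{ij}$, which means that $\tilde\rho = \hat\rho$. □

Proof of Proposition D.3. Propositions E.2 to E.5 imply that $f(\rho) < f(\bar\rho)$ for all $\rho \in R$ that are $s$-stable for some $1 \le s < k$. Thus, assume that $\rho \in R$ is 0-stable and $\rho \ne \bar\rho$. By Proposition E.2 we may assume that $\|\rho\|_\infty < 0.15$, as otherwise $f(\rho) < 0$. Hence, Corollary E.8 (applied to each row $i$ with $J = [k]$ and $\lambda = 1$) implies that $f(\bar\rho) > f(\rho)$. Thus, because $f$ takes a (global) maximum on $R$, we conclude that this maximum must be attained at $\bar\rho$ and that $f(\rho) < f(\bar\rho)$ for all $\rho \ne \bar\rho$. Indeed, as the set $R_\eta = \{ \rho \in R : \|\rho - \bar\rho\|_\infty \ge \eta \}$ is compact and because $f$ is continuous, there is $\xi > 0$ such that $f(\rho) \le f(\bar\rho) - \xi$ for all $\rho \in R_\eta$. Thus, the assertion follows from Lemma E.1. □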

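E.2 Proof of Proposition E.2

In order to establish Proposition E.2 we are going to determine the maximum of $f$ over the set $\mathcal{S}$ of singly-stochastic matrices on a row-by-row basis. The following lemma deals with rows $\rho_i$ that do not have an entry near $1/2$.

Lemma E.9 Let $\rho$ be a stochastic $k \times k$ matrix. Let $i \in [k]$ be such that $\rho_{ij} \notin [0.49, 0.51]$ for all $j \in [k]$.

1. Suppose that $\rho_{ij} < 0.49$ for all $j \in [k]$. Let $\rho'$ be the matrix with entries

\[ \rho'_{hj} = \rho_{hj} \quad\text{and}\quad \rho'_{ij} = 1/k \quad\text{for all } j \in [k],\ h \in [k] \setminus \{i\}. \]

Then $f(\rho) \le f(\rho')$.

2. Suppose that $\rho_{ij} > 0.51$ for some $j \in [k]$. Then there is a number $\alpha = 1/k + \tilde O(1/k^2)$ such that for the matrix $\rho''$ with entries

\[ \rho''_{hj'} = \rho_{hj'} \ \text{ for all } h \in [k] \setminus \{i\},\ j' \in [k], \qquad \rho''_{ij} = 1 - \alpha, \qquad \rho''_{ij'} = \frac{\alpha}{k - 1} \ \text{ for all } j' \in [k] \setminus \{j\} \]

we have $f(\rho) \le f(\rho'')$.

Proof. To obtain the first assertion, we simply apply Corollary E.8 to row $i$ and $J = [k]$ (with $\lambda = 0.999$). Now, suppose that $\rho_{ij} > 0.51$ for some pair $(i, j)$. Without loss of generality, we may assume that $(i, j) = (1, 1)$ and that $\rho_{11} \ge \cdots \ge \rho_{1k}$. Let $\tilde\rho \in \mathcal{S}$ be the matrix that maximizes $f$ subject to the conditions

i. $\tilde\rho_{11} \ge 0.51$.

ii. $\tilde\rho_a = \rho_a$ for all $a \in \{2, \ldots, k\}$.

We aim to prove that $\tilde\rho = \rho''$. Since $\tilde\rho_{1j} \le 1 - \tilde\rho_{11} \le 0.49$ for all $j \ge 2$, Corollary E.8 applies to $J = \{2, \ldots, k\}$ (with $\lambda = 0.999$) and yields

\[ \tilde\rho_{12} = \cdots = \tilde\rho_{1k} = \frac{1 - \tilde\rho_{11}}{k - 1}. \]

Let $\delta = \tilde\rho_{11} - \tilde\rho_{12} = \tilde\rho_{11} - O(1/k)$ and let $0 \le \lambda \le 0.49 k$ be such that $\tilde\rho_{11} = 1 - \lambda/k$. Since $\|\tilde\rho\|_2^2 \in [1, k]$, we have $Q = 1 - 2/k + \|\tilde\rho\|_2^2/k^2 \ge (1 - 1/k)^2$. Therefore,

\[ \exp\left[ \frac{\delta d}{k Q} \right] = k^{2 \tilde\rho_{11}} (1 + \tilde O(1/k)) = k^{2(1 - \lambda/k)} (1 + \tilde O(1/k)). \]

Furthermore,

\[ 1 + \frac{\delta}{\tilde\rho_{12}} = \frac{\tilde\rho_{11}}{\tilde\rho_{12}} = \frac{(k - 1) \tilde\rho_{11}}{1 - \tilde\rho_{11}} = \frac{k^2}{\lambda} (1 - \lambda/k)(1 + O(1/k)). \]

Define

\[ \zeta : \lambda \mapsto \frac{k^{2\lambda/k}}{\lambda} \left( 1 - \frac{\lambda}{k} \right) \]

so that

\[ \left( 1 + \frac{\delta}{\tilde\rho_{12}} \right) \exp\left[ -\frac{\delta d}{k Q} \right] = (1 + \tilde O(1/k))\, \zeta(\lambda). \]

We have

\[ \frac{\partial}{\partial \lambda} \zeta(\lambda) = k^{2\lambda/k} \left[ \frac{2 \ln k}{k} \left( \frac{1}{\lambda} - \frac{1}{k} \right) - \frac{1}{\lambda^2} \right]. \]

This differential is zero at $\frac{k}{2} (1 \pm \sqrt{1 - 2/\ln k})$. At $\mu = \frac{k}{2} (1 - \sqrt{1 - 2/\ln k}) = (1 + o_k(1)) \frac{k}{2 \ln k}$ the function $\zeta$ attains a local minimum, while $\frac{k}{2} (1 + \sqrt{1 - 2/\ln k}) > k/2$ is a local maximum. Furthermore, $\zeta(1) = 1 + \tilde O(1/k)$ and $\zeta'(1) = -1 + o_k(1)$. Therefore, there is $\gamma = \tilde O(1/k)$ such that for $1 + \gamma < \lambda \le \mu$ we have

\[ \left( 1 + \frac{\delta}{\tilde\rho_{12}} \right) \exp\left[ -\frac{\delta d}{k Q} \right] = (1 + \tilde O(1/k))\, \zeta(\lambda) \le (1 + \tilde O(1/k))\, \zeta(1 + \gamma) < 1. \]

Furthermore, if $\lambda = 0.49 k$, then

\[ \left( 1 + \frac{\delta}{\tilde\rho_{12}} \right) \exp\left[ -\frac{\delta d}{k Q} \right] = (1 + \tilde O(1/k))\, \zeta(\lambda) = (1 + \tilde O(1/k))\, k^{0.98} \left[ \frac{1}{0.49 k} - \frac{1}{k} \right] < 1. \]

As a consequence, we have

\[ 1 + \frac{\delta}{\tilde\rho_{12}} < \exp\left[ \frac{\delta d}{k Q} \right] \quad\text{for all } 1 + \gamma < \lambda \le 0.49 k. \tag{55} \]

In addition, because $\mu$ is the unique local minimum of $\zeta$ and because $\zeta(1) = 1 + \tilde O(1/k)$ and $\zeta'(1) = -1 + o_k(1)$, we can choose $\gamma = \tilde O(1/k)$ such that

\[ \left( 1 + \frac{\delta}{\tilde\rho_{12}} \right) \exp\left[ -\frac{\delta d}{k Q} \right] = (1 + \tilde O(1/k))\, \zeta(\lambda) \ge (1 + \tilde O(1/k))\, \zeta(1 - \gamma) > 1 \quad\text{for } 0 < \lambda < 1 - \gamma. \tag{56} \]

Thus, Lemma E.6 and the maximality of $f(\tilde\rho)$ imply that $\tilde\rho_{11} = 1 - 1/k + \tilde O(1/k^2)$, as claimed. □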
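The shape of the auxiliary function in this proof can be checked numerically. The sketch below assumes the form $\zeta(\lambda) = k^{2\lambda/k}(1/\lambda - 1/k)$, which is what the ratio $(1 + \delta/\rho_{12})\exp[-\delta d/(kQ)]$ reduces to up to a factor $1 + \tilde O(1/k)$ when $d/k \approx 2\ln k$; this form is inferred from the surrounding estimates and should be read as an assumption.

```python
import math

def zeta(lam: float, k: int) -> float:
    """zeta(lambda) = k^(2*lambda/k) * (1/lambda - 1/k)."""
    return k ** (2 * lam / k) * (1 / lam - 1 / k)

def zeta_prime(lam: float, k: int) -> float:
    """Derivative of zeta with respect to lambda."""
    return k ** (2 * lam / k) * ((2 * math.log(k) / k) * (1 / lam - 1 / k) - 1 / lam**2)

def critical_points(k: int):
    """The two zeros (k/2)(1 -/+ sqrt(1 - 2/ln k)) of zeta'; needs ln k > 2."""
    root = math.sqrt(1 - 2 / math.log(k))
    return k / 2 * (1 - root), k / 2 * (1 + root)

k = 10**4
mu, upper = critical_points(k)
```

For $k = 10^4$ the smaller critical point is $\mu \approx 576$, close to $k/(2\ln k) \approx 543$ as the $(1 + o_k(1))$ factor suggests, and $\zeta$ is indeed decreasing through $\lambda = 1$ with slope close to $-1$.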



The following lemma allows us to get rid of rows that have an entry close to $1/2$.

Lemma E.10 Let $\rho$ be a stochastic matrix such that $\rho_{ij} \in [0.49, 0.51]$ for some $(i, j) \in [k]^2$. Then there is a stochastic matrix $\rho'$ such that $\rho'_{ij} \notin [0.49, 0.51]$ for all $j \in [k]$ and such that $f(\rho') \ge f(\rho) + \frac{\ln k}{5k}$.

Proof. Without loss of generality we may assume that $(i, j) = (1, 1)$ and that $\rho \in \mathcal{S}$ maximizes $f$ subject to the condition that $\rho_{11} \in [0.49, 0.51]$. There are two cases.

Case 1: $\rho_{1j} < 0.49$ for all $j \ge 2$. Applying Corollary E.8 to the set $J = \{2, \ldots, k\}$ (with $\lambda = 0.999$), we see that $\rho_{1j} = \frac{1 - \rho_{11}}{k - 1}$ for all $j \ge 2$ due to the maximality of $f(\rho)$. Hence, Corollary A.4 yields

\[ \mathcal{H}(\rho_1) \le h(\rho_{11}) + (1 - \rho_{11}) \ln(k - 1) \le \ln 2 + 0.51 \ln k. \tag{57} \]

Moreover, because $\rho_{11} \le 0.51$ we have

\[ \|\rho_1\|_2^2 \le 0.51^2 + (k - 1) \left( \frac{1 - \rho_{11}}{k - 1} \right)^2 \le 0.261. \tag{58} \]

Let $\rho'$ be the matrix obtained from $\rho$ by replacing the first row by the vector $(1, 0, \ldots, 0)$. Since $\mathcal{H}(1, 0, \ldots, 0) = 0$, (57) yields

\[ f(\rho) - f(\rho') = H(\rho) - H(\rho') + E(\rho) - E(\rho') \le \frac{\ln 2 + 0.51 \ln k}{k} + E(\rho) - E(\rho'). \tag{59} \]

Furthermore, (58) entails $\|\rho\|_2^2 - \|\rho'\|_2^2 \le \|\rho_1\|_2^2 - 1 \le -0.739$. Hence, (42) yields

\[ E(\rho) - E(\rho') \le -0.739 (1 + \tilde O(1/k)) \ln k/k \le -0.73 \ln k/k. \tag{60} \]

Combining (59) and (60), we obtain $f(\rho) - f(\rho') \le \frac{1}{k} [\ln 2 - 0.22 \ln k] \le -\ln k/(5k)$.

Case 2: there is $j \ge 2$ such that $\rho_{1j} \ge 0.49$. We may assume that $j = 2$. Applying Corollary E.8 to $J = \{3, \ldots, k\}$ (with $\lambda = 0.999$), we find that $\rho_{1j} = (1 - \rho_{11} - \rho_{12})/(k - 2)$ for all $j \ge 3$ due to the maximality of $f(\rho)$. Hence, Proposition A.3 yields

\[ \mathcal{H}(\rho_1) \le 2 \ln 2 + 0.02 \ln k. \tag{61} \]

Further, because $\rho_{11}^2 + \rho_{12}^2 \le 0.51^2 + 0.49^2$ as $\rho_{11}, \rho_{12} \in [0.49, 0.51]$ and $\rho_{11} + \rho_{12} \le 1$, we see that

\[ \|\rho_1\|_2^2 \le 0.51^2 + 0.49^2 + (k - 2) \left( \frac{1 - \rho_{11} - \rho_{12}}{k - 2} \right)^2 \le 0.501. \tag{62} \]

As in the first case, let $\rho'$ be the matrix obtained from $\rho$ by replacing the first row by the vector $(1, 0, \ldots, 0)$. Then (61) gives

\[ f(\rho) - f(\rho') = H(\rho) - H(\rho') + E(\rho) - E(\rho') \le \frac{1}{k} [2 \ln 2 + 0.02 \ln k] + E(\rho) - E(\rho'). \tag{63} \]

Further, from (62) we obtain $\|\rho\|_2^2 - \|\rho'\|_2^2 \le 0.501 - 1 = -0.499$. Hence, (42) yields

\[ E(\rho) - E(\rho') \le -0.499 (1 + \tilde O(1/k)) \ln k/k \le -0.49 \ln k/k. \tag{64} \]

Combining (63) and (64), we get $f(\rho) - f(\rho') \le \frac{1}{k} [2 \ln 2 - 0.47 \ln k] \le -\ln k/(5k)$.

Hence, in either case we obtain the desired bound. □
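The closing inequalities of the two cases hold only for sufficiently large $k$, and it can be instructive to see how large. The sketch below looks only at the last step of each case, $\ln 2 - 0.22\ln k \le -0.2\ln k$ and $2\ln 2 - 0.47\ln k \le -0.2\ln k$; it ignores the $\tilde O(1/k)$ corrections absorbed earlier, so the thresholds are indicative rather than sharp statements about the lemma. Function names are illustrative.

```python
import math

def case1_slack(k: float) -> float:
    """(ln 2 - 0.22 ln k) - (-0.2 ln k); must be <= 0 for the Case 1 conclusion."""
    return math.log(2) - 0.02 * math.log(k)

def case2_slack(k: float) -> float:
    """(2 ln 2 - 0.47 ln k) - (-0.2 ln k); must be <= 0 for the Case 2 conclusion."""
    return 2 * math.log(2) - 0.27 * math.log(k)
```

Case 1 requires $0.02 \ln k \ge \ln 2$, i.e. $k \ge 2^{50}$, whereas Case 2 is already satisfied from $k = 170$ on. This illustrates that "sufficiently large $k$" in this argument can be very large indeed.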

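Lemmas E.9 and E.10 show that for any stochastic matrix $\rho$ there is a stochastic matrix $\hat\rho$ whose rows are either of the form $\frac{1}{k}\mathbf{1}$ or that are close to the vector $(1, 0, \ldots, 0)$ such that $f(\hat\rho) \ge f(\rho)$. The following lemma provides a bound on $f(\hat\rho)$. Recall that $d = 2k \ln k - \ln k - c$ with $c = 2 \ln 2 + o_k(1)$.

Lemma E.11 Let $\rho = (\rho_{ij})$ be the $k \times k$ matrix with entries

\[ \rho_{ij} = \begin{cases} 1 & \text{if } i = j \in [s], \\ 0 & \text{if } i \in [s],\ j \ne i, \\ 1/k & \text{otherwise.} \end{cases} \]

Then

\[ f(\rho) = \frac{c}{k} + \frac{s}{k} \left( 1 - \frac{s}{k} \right) \frac{\ln k}{2k} - \frac{c s}{2k^2} + o(1/k). \]

Proof. We have $\|\rho\|_2^2 = s + 1 - s/k$. Hence, we obtain from (41)

\[ E(\rho) = \frac{d}{2k^2} \left[ -2k + s + 1 - \frac{s}{k} - 2 \left( 1 - \frac{s}{2k} \right)^2 \right] + o(1/k) \]

\[ = \frac{d}{2k^2} \left[ -2k - 1 + s \left( 1 + \frac{1}{k} - \frac{s}{2k^2} \right) \right] + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s}{2k} \left( \frac{d}{k} + \frac{d}{k^2} - \frac{d s}{2k^3} \right) + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s}{k} \left( \ln k + \frac{\ln k}{2k} - \frac{c}{2k} - \frac{s \ln k}{2k^2} \right) + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s}{k} \ln k + \frac{s \ln k}{2k^2} \left( 1 - \frac{s}{k} \right) - \frac{c s}{2k^2} + o(1/k). \]

Further, $H(\rho) = \ln k + (1 - s/k) \ln k = 2 \ln k - \frac{s}{k} \ln k$. Thus,

\[ H(\rho) + E(\rho) = \frac{c}{k} + \frac{s \ln k}{2k^2} \left( 1 - \frac{s}{k} \right) - \frac{c s}{2k^2} + o(1/k), \]

as claimed.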
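The asymptotics for the block matrix of Lemma E.11 can be compared with the exact value of $f$. The sketch below uses $\|\rho\|_2^2 = s + 1 - s/k$ and $H(\rho) = 2\ln k - (s/k)\ln k$ for that matrix, evaluates $E$ exactly from its definition, and compares with the expansion $c/k + \frac{s\ln k}{2k^2}(1 - s/k) - \frac{cs}{2k^2}$. Taking $c = 2\ln 2$ exactly (dropping the $o_k(1)$ part) is an assumption made here for illustration.

```python
import math

def f_exact(k: int, s: int, c: float) -> float:
    """Exact f for the matrix with s unit rows and k - s uniform rows."""
    d = 2 * k * math.log(k) - math.log(k) - c
    frob_sq = s + 1 - s / k
    entropy = 2 * math.log(k) - (s / k) * math.log(k)
    energy = 0.5 * d * math.log1p(-2 / k + frob_sq / k**2)
    return entropy + energy

def f_asymptotic(k: int, s: int, c: float) -> float:
    """Leading terms of f, up to an o(1/k) error."""
    return c / k + (s * math.log(k) / (2 * k**2)) * (1 - s / k) - c * s / (2 * k**2)

k, c = 10**6, 2 * math.log(2)
gaps = [k * abs(f_exact(k, s, c) - f_asymptotic(k, s, c)) for s in (0, k // 2, k - 1)]
```

The rescaled gaps $k \cdot |f_{\text{exact}} - f_{\text{asymptotic}}|$ are tiny, consistent with an $o(1/k)$ error. The case $s = 0$ recovers $f(\bar\rho) = c/k + o(1/k)$, and $s = k/2$ maximizes the $\ln k$ term at $\ln k/(8k)$.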



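Corollary E.12 We have $\max_{\rho \in \mathcal{S}} f(\rho) \le \frac{\ln k}{8k} + O(1/k)$.

Proof. Lemma E.10 implies that the maximum is attained at a matrix $\rho$ without entries in $[0.49, 0.51]$. Therefore, Lemma E.9 shows that the maximizer $\rho$ has the following form for some integer $0 \le s \le k$ and some $\alpha = 1/k + \tilde O(1/k^2)$:

\[ \rho_{ij} = \begin{cases} 1 - \alpha & \text{if } i = j \in [s], \\ \frac{\alpha}{k - 1} & \text{if } i \in [s],\ j \ne i, \\ 1/k & \text{otherwise.} \end{cases} \]

Thus, for $i \in [s]$ we have

\[ \mathcal{H}(\rho_i) = h(1 - \alpha) + \alpha \ln(k - 1) \le h(\alpha) + \alpha \ln k, \qquad \|\rho_i\|_2^2 = (1 - \alpha)^2 + \alpha^2/(k - 1). \]

Let $\rho'$ be the matrix obtained from $\rho$ by replacing the first $s$ rows by $(1, 0, \ldots, 0)$. Then

\[ H(\rho) - H(\rho') \le \frac{s}{k} [h(\alpha) + \alpha \ln k] \le \frac{\alpha s}{k} [1 - \ln \alpha + \ln k] \le \frac{2 \alpha s}{k} [1 + \ln k], \tag{65} \]

\[ \|\rho\|_2^2 - \|\rho'\|_2^2 \le s \left[ (1 - \alpha)^2 + \frac{\alpha^2}{k - 1} - 1 \right] = \alpha s \left[ -2 + \alpha (1 + 1/(k - 1)) \right] = \alpha s \left[ -2 + \tilde O(1/k) \right]. \tag{66} \]

Plugging (66) into (42) yields

\[ E(\rho) - E(\rho') \le \alpha s \left[ -2 + \tilde O(1/k) \right] \cdot (1 + \tilde O(1/k)) \frac{\ln k}{k} \le -\frac{2 \alpha s}{k} \ln k + \tilde O(1/k^2). \tag{67} \]

Combining (65) and (67), we obtain $f(\rho) - f(\rho') \le \frac{2 \alpha s}{k} + \tilde O(1/k^2) \le 3/k$. Thus, the assertion follows from Lemma E.11. □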

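Corollary E.13 Suppose that $\rho \in \mathcal{S}$ has an entry $\rho_{ij} \in [0.49, 0.51]$. Then $f(\rho) < 0$.

Proof. Let $\rho$ be a matrix as in Corollary E.13. By Lemma E.10 and Corollary E.12 we have

\[ f(\rho) \le \max_{\rho' \in \mathcal{S}} f(\rho') - \frac{\ln k}{5k} \le \frac{\ln k}{8k} - \frac{\ln k}{5k} + O(1/k) = -\frac{3 \ln k}{40 k} + O(1/k), \]

whence the assertion follows. □

Lemma E.14 Suppose that $\rho \in \mathcal{S}$ has a row $i$ such that $\max_{j \in [k]} \rho_{ij} \in [0.15, 0.49]$. Then $f(\rho) < 0$.

Proof. We may assume that $\rho$ maximizes $f$ subject to the condition $\rho_{11} = \max_j \rho_{1j} \in [0.15, 0.49]$. Applying Corollary E.8 to $J = \{2, \ldots, k\}$ (with $\lambda = 0.999$), we obtain $\rho_{1j} = (1 - \rho_{11})/(k - 1)$ for all $j > 1$. Let $\hat\rho$ be the matrix obtained from $\rho$ by replacing the first row by $\frac{1}{k}\mathbf{1}$. Then

\[ \mathcal{H}(\rho_1) = h(\rho_{11}) + (1 - \rho_{11}) \ln(k - 1) \quad\text{and}\quad \mathcal{H}(\hat\rho_1) = \ln k. \]

Therefore,

\[ H(\rho) - H(\hat\rho) = -\frac{1}{k} \left[ \ln k - h(\rho_{11}) - (1 - \rho_{11}) \ln(k - 1) \right] \le -\rho_{11} \frac{\ln k}{k} + O(1/k). \tag{68} \]

Moreover, $\|\rho_1\|_2^2 = \rho_{11}^2 + (1 - \rho_{11})^2/(k - 1)$ and $\|\hat\rho_1\|_2^2 = 1/k$, whence

\[ \|\rho\|_2^2 - \|\hat\rho\|_2^2 \le \rho_{11}^2 + \frac{(1 - \rho_{11})^2}{k - 1} - \frac{1}{k} \le \rho_{11}^2. \]

Hence, (42) implies $E(\rho) - E(\hat\rho) \le \rho_{11}^2 \frac{\ln k}{k} + \tilde O(1/k^2)$. Combining this estimate with (68), we get

\[ f(\rho) - f(\hat\rho) \le -\rho_{11} (1 - \rho_{11}) \frac{\ln k}{k} + O(1/k). \]

Since $f(\hat\rho) \le \frac{\ln k}{8k} + O(1/k)$ by Corollary E.12, we obtain

\[ f(\rho) \le \left[ \frac{1}{8} - \rho_{11} (1 - \rho_{11}) \right] \frac{\ln k}{k} + O(1/k). \]

The assertion follows because $\rho_{11} (1 - \rho_{11}) > 1/8$ for $\rho_{11} \in [0.15, 0.49]$. □

Finally, Proposition E.2 follows directly from Corollary E.13 and Lemma E.14.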

E.3 Proof of Proposition E.3

The strategy behind the proofs of Propositions E.3, E.4 and E.5 is to bound the maximum of $f$ over $s$-stable $\rho$ in terms of the values attained at certain peculiar matrices. Combinatorially, these matrices correspond to pairs $\sigma, \tau$ of colorings that completely coincide on $s$ color classes and that are completely uncorrelated on the remaining $k - s$ classes. In the following lemma we estimate their function values.

Lemma E.15 Let $1 \le s \le k - 1$. Let $\rho$ be the $k \times k$ matrix with entries

\[ \rho_{ij} = \begin{cases} 1 & \text{if } i = j \in [s], \\ 1/(k - s) & \text{if } i, j \in [k] \setminus [s], \\ 0 & \text{otherwise.} \end{cases} \]

Then

\[ f(\rho) \le \frac{c}{k} + (1 - s/k) \ln(1 - s/k) + \frac{s \ln k}{2k^2} \left( 3 - \frac{s}{k} \right) - \frac{c s}{2k^2} + o(1/k). \]

Proof. We have

\[ H(\rho) = \ln k + \frac{k - s}{k} \ln(k - s) = 2 \ln k + (1 - s/k) \ln(1 - s/k) - \frac{s}{k} \ln k. \]

Moreover, $\|\rho\|_2^2 = s + 1$. Thus, by (41)

\[ E(\rho) = \frac{d}{2k^2} \left[ -2k + s + 1 - 2 \left( 1 - \frac{s + 1}{2k} \right)^2 \right] + o(1/k) \]

\[ = \frac{d}{2k^2} \left[ -2k + s + 1 - 2 + \frac{2s}{k} - \frac{s^2}{2k^2} \right] + o(1/k) \]

\[ = \frac{d}{2k^2} \left[ -2k - 1 + s \left( 1 + \frac{2}{k} - \frac{s}{2k^2} \right) \right] + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s}{2k} \left( \frac{d}{k} + \frac{2d}{k^2} - \frac{d s}{2k^3} \right) + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s}{k} \left( \ln k - \frac{\ln k}{2k} - \frac{c}{2k} + \frac{2 \ln k}{k} - \frac{s \ln k}{2k^2} \right) + o(1/k) \]

\[ = -2 \ln k + \frac{c}{k} + \frac{s \ln k}{k} \left( 1 + \frac{3}{2k} - \frac{s}{2k^2} \right) - \frac{c s}{2k^2} + o(1/k), \]

whence the assertion follows. □



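Proof of Proposition E.3. Let $1 \le s \le k^{0.999}$ and let $\rho \in R$ be the $s$-stable matrix that maximizes $f$. Without loss of generality we may assume that $\rho_{ii} > 1 - \kappa$ for $i = 1, \ldots, s$ and, due to Proposition E.2, that $\rho_{ij} < 0.15$ for all other $(i, j)$. Let $\hat\rho$ be the stochastic matrix with entries

\[ \hat\rho_{ij} = \begin{cases} \rho_{ij} & \text{if } i \in [k],\ j \le s, \\ \frac{1}{k - s} \sum_{l > s} \rho_{il} & \text{if } i \in [k],\ j > s. \end{cases} \]

Since $k - s \sim k$ and $\max_{j > s} \rho_{ij} < 0.15$, Corollary E.8 applied to $J = [k] \setminus [s]$ and $\lambda = 0.999$ implies that $f(\rho) \le f(\hat\rho)$. We are going to compare $f(\hat\rho)$ with $f(\rho^*)$, where $\rho^*$ is the matrix with entries

\[ \rho^*_{ij} = \begin{cases} 1 & \text{if } i = j \in [s], \\ 1/(k - s) & \text{if } i, j \in [k] \setminus [s], \\ 0 & \text{otherwise.} \end{cases} \]

Because $\rho$ is doubly-stochastic we have

\[ \sum_{i=1}^{s} \sum_{j > s} \hat\rho_{ij} = \sum_{i=1}^{s} \sum_{j > s} \rho_{ij} = \sum_{i > s} \sum_{j=1}^{s} \rho_{ij} = \sum_{i > s} \sum_{j=1}^{s} \hat\rho_{ij}. \tag{69} \]

Let $q_i = \sum_{j \ne i} \hat\rho_{ij} = 1 - \hat\rho_{ii} \le \kappa$ for $i \le s$ and $q_i = \sum_{j=1}^{s} \hat\rho_{ij}$ for $i > s$. By Corollary A.4,

\[ \mathcal{H}(\hat\rho_i) \le h(q_i) + q_i \ln k \le h(\kappa) + \kappa \ln k \quad\text{for } i \in [s]. \tag{70} \]

Furthermore, for $i > s$ Corollary A.4 yields

\[ \mathcal{H}(\hat\rho_i) \le h(q_i) + q_i \ln s + (1 - q_i) \ln(k - s) \le h(q_i) + q_i \ln s + \ln(k - s). \]

Since $h$ is concave and $\sum_{i > s} q_i \le \kappa s$, we obtain from (69)

\[ \frac{1}{k} \sum_{i > s} \mathcal{H}(\hat\rho_i) \le \frac{k - s}{k} \ln(k - s) + \frac{1}{k} \sum_{i > s} (h(q_i) + q_i \ln s) \le \frac{k - s}{k} \ln(k - s) + h\left( \frac{\kappa s}{k} \right) + \frac{\kappa s}{k} \ln s. \tag{71} \]

Combining (70) and (71), we see that

\[ H(\hat\rho) \le \ln k + \frac{s}{k} (h(\kappa) + \kappa \ln k) + \frac{k - s}{k} \ln(k - s) + h(\kappa s/k) + \frac{\kappa s}{k} \ln s \le \ln k + \frac{k - s}{k} \ln(k - s) + o(1/k) = H(\rho^*) + o(1/k). \tag{72} \]

In addition, we claim that

\[ \|\hat\rho\|_2^2 \le s + 1 + (\kappa s)^2 \le s + 1 + k^{-\Omega(1)}. \tag{73} \]

To see this, notice that

\[ \|\hat\rho_i\|_2^2 \le 1 \quad\text{for } i = 1, \ldots, s \tag{74} \]

because $\hat\rho$ is a stochastic matrix. Furthermore, since $\sum_{j > s} \hat\rho_{ij} \le 1$ for each $i \in [k] \setminus [s]$, we have

\[ \sum_{j > s} \hat\rho_{ij}^2 = (k - s) \left( \frac{\sum_{j > s} \hat\rho_{ij}}{k - s} \right)^2 \le 1/(k - s) \quad\text{for } i > s. \tag{75} \]

Finally, as $\sum_{i > s} \sum_{j \le s} \hat\rho_{ij} \le \kappa s$, we have

\[ \sum_{i > s} \sum_{j \le s} \hat\rho_{ij}^2 \le (\kappa s)^2. \tag{76} \]

Combining (74) to (76) yields (73).

Hence, (42) yields $E(\hat\rho) \le E(\rho^*) + o(1/k)$. Thus, by (72) and Lemma E.15

\[ f(\rho) \le f(\hat\rho) \le f(\rho^*) + o(1/k) \]

\[ \le \frac{c}{k} + (1 - s/k) \ln(1 - s/k) + \frac{s \ln k}{2k^2} \left( 3 - \frac{s}{k} \right) - \frac{c s}{2k^2} + o(1/k) \]

\[ \le \frac{c}{k} + (1 - s/k) \ln(1 - s/k) + o(1/k) \le \frac{c}{k} - \frac{s}{k} (1 - s/k) + o(1/k) \]

\[ = f(\bar\rho) - \frac{s}{k} (1 - s/k) + o(1/k). \]

The above function is decreasing in $s$ (for $s = o(k)$). Thus, it is easily verified that $f(\rho) \le f(\bar\rho) - 1/k + o(1/k)$. □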

E.4 Proof of Proposition IE3I 

Let fc - 999 < s < fc — fc 049 and let p e i? be the s-stable matrix that maximizes /. Without loss of 



generality we may assume that pu > 1 — k for i = 1, . . . , s and, due to Proposition [R2] that p,y < 0.15 
for all other (i,j). 
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Let p be the stochastic matrix with entries 

( Pij if » = j'e [s] , 

Pij = < k=sEi >s Pa if »' G [*] , J G [*] \ W , 

1 ii;E ¥ K 8 Ai ifie[fc],jG[*]\{t}. 

Since max^j pij < 0.15 and s,k — s > k° , we can apply Corollary IE. 81 to Jj = [fc] \ [s] and to 
^ = H \ {*} f or a11 i e [fc] (with A = 0.4). We thus obtain f(p) < f(p). 
Let 

Qj = y] p^ = X] fa for * - s and * = 5Z ^ = 5Z ^ for * > s - 

j>s j>s j<s j<s 

Since p is doubly stochastic, we have 

s 

q = ^2 Qi = X! Qi < ^2 1 - Pa < K s- ( 77 ) 

i>s i<s i—1 

In addition, let 

u = ^ p i j = X! ^ ' - * - pu - K tt - s )- ( ^ 78 - ) 

By Coroll arv I A . 4 1 and the concavity of /i, for alH < s we have 

W(&) < h{U)+tilns + h(qi) + qiln(k-s). 

Again because h is concave, we obtain 

s _. s 

lY, n (M ^ H+^h(q/ S ) + ^ln{k-s),withH = -Y / (HU) + t t \ns). (79) 

£=1 i—1 

Furthermore, again by Corollary IA.41 for i > s we have 

H{pi) < %i) + %lns+(l-%)ln(fc-s). 
Once more by the concavity of h, 

lY. H ^ ^ ^Hv/ik - s)) + | In s + k - S k ~ q ln(fc - s). (80) 

Combining ( 1791 and ( f80b . we see that 

H(p) < lnfe+(|%/s) + |ln(/c-s)) + r^^%/(fc-s)) + |lns 
+ *~f~ g ln(ifc-«)+ff. 
Using the elementary inequality $h(z) \le z(1 - \ln z)$ to simplify the above, we get

$$\begin{aligned} H(\hat\rho) - H &\le \ln k + \frac{q}{k} \big[ 2 + \ln(s/q) + \ln((k-s)/q) + \ln s + \ln(k-s) \big] + \frac{k-s-q}{k} \ln(k-s) \\ &\le \ln k + \frac{q}{k} \big[ 2 + 2\ln(s) + \ln(k-s) - 2\ln q \big] + \frac{k-s}{k} \ln(k-s) \\ &\le \ln k + \frac{3q}{k} (2 + \ln k) + \frac{k-s}{k} \ln(k-s) + O(1/k) \qquad [\text{as } -z \ln z \le 1 \text{ for all } z > 0] \\ &= 2 \ln k + \frac{3q}{k} (2 + \ln k) + (1 - s/k) \ln(1 - s/k) - \frac{s}{k} \ln k + O(1/k). \end{aligned} \tag{81}$$
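
The last equality in (81) only splits one logarithm. As a worked step in the notation of this proof (nothing is assumed beyond $0 < s < k$):

```latex
\begin{align*}
\frac{k-s}{k}\,\ln(k-s)
  &= \Big(1-\frac{s}{k}\Big)\Big[\ln k + \ln\Big(1-\frac{s}{k}\Big)\Big] \\
  &= \ln k - \frac{s}{k}\,\ln k + \Big(1-\frac{s}{k}\Big)\ln\Big(1-\frac{s}{k}\Big).
\end{align*}
```

Adding the leading $\ln k$ from the previous line gives the terms $2\ln k - \frac{s}{k}\ln k + (1 - s/k)\ln(1 - s/k)$ that appear in (81).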
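Furthermore, we claim that the Frobenius norm satisfies

$$\|\hat\rho\|_2^2 \le s + 1 - 2 \sum_{i=1}^{s} t_i + o(1/\ln k). \tag{82}$$

To see this, note that by the definition (78) of $t_i$, for $i \in [s]$ we have

$$\rho_{ii}^2 \le (1 - t_i)^2 = 1 - 2t_i + t_i^2 \le 1 - 2t_i + \kappa^2. \tag{83}$$

Moreover,

$$\sum_{j \in [s] \setminus \{i\}} \hat\rho_{ij}^2 = (s-1) \cdot \Big( \frac{t_i}{s-1} \Big)^2 = \frac{t_i^2}{s-1} \le \kappa^2 \qquad (i \in [s]). \tag{84}$$

Similarly, since $\hat\rho$ is stochastic and $\rho_{ii} \ge 1 - \kappa$ for $i \in [s]$, we have

$$\sum_{j > s} \hat\rho_{ij}^2 \le (k-s) \cdot \Big( \frac{\kappa}{k-s} \Big)^2 = \frac{\kappa^2}{k-s} \le \kappa^2 \qquad (i \in [s]). \tag{85}$$

Combining (83)-(85), we thus obtain

$$\sum_{i=1}^{s} \|\hat\rho_i\|_2^2 \le s + 3s\kappa^2 - 2 \sum_{i=1}^{s} t_i = s + o_k(1/\ln k) - 2 \sum_{i=1}^{s} t_i. \tag{86}$$

Further, since $q = \sum_{i > s} \sum_{j \in [s]} \rho_{ij} \le \kappa s$ by (77), we have

$$\sum_{i > s} \sum_{j \in [s]} \hat\rho_{ij}^2 \le \kappa^2 s = o_k(1/\ln k); \tag{87}$$

this is because the sum of squares on the l.h.s. is maximised in the case that $\hat\rho_{kj} = \kappa$ for all $j \in [s]$, while $\hat\rho_{ij} = 0$ for all $s < i < k$, $j \in [s]$. In addition, we have

$$\sum_{i > s} \sum_{j > s} \hat\rho_{ij}^2 \le \sum_{i > s} (k-s) \Big( \frac{1}{k-s} \Big)^2 = 1. \tag{88}$$

Finally, combining (86)-(88) yields (82). Using Lemma uTT31 and (42), we find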

To complete the proof, we need the following little estimate. 

Lemma E.16 We have $\max_{0 \le z \le 1} h(z) - z \ln k \le 1/k$.

Proof. We have $h(z) \le z(1 - \ln z)$. It is easily verified that the function $z \mapsto z(1 - \ln z - \ln k)$ takes its maximum at $z = 1/k$, where it equals $1/k$. □
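
Lemma E.16 is easy to sanity-check numerically. The sketch below is an illustration only: it assumes that $h$ is the binary entropy in nats (which matches the two facts used in the text, $h(z) \le z(1 - \ln z)$ and $h(1-z) = h(z)$), and the names `h` and `max_gap` are hypothetical helpers, not notation from the paper.

```python
import math


def h(z: float) -> float:
    """Binary entropy in nats (assumed meaning of h), with h(0) = h(1) = 0."""
    if z <= 0.0 or z >= 1.0:
        return 0.0
    return -z * math.log(z) - (1.0 - z) * math.log(1.0 - z)


def max_gap(k: int, grid: int = 100_000) -> float:
    """Grid approximation of the maximum over z in [0, 1] of h(z) - z ln k."""
    ln_k = math.log(k)
    return max(h(i / grid) - (i / grid) * ln_k for i in range(grid + 1))


# Compare the numerical maximum with the bound 1/k from Lemma E.16.
for k in (3, 10, 100, 1000):
    gap = max_gap(k)
    print(k, gap, 1.0 / k, gap <= 1.0 / k)
```

Under this assumption the maximum can also be computed in closed form: it is attained at $z = 1/(k+1)$ and equals $\ln(1 + 1/k)$, which is indeed at most $1/k$.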

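Lemma E.16 implies

$$H - \frac{2 \ln k}{k} \sum_{i=1}^{s} t_i \le \frac{1}{k} \sum_{i=1}^{s} h(t_i) - t_i \ln k = O(1/k). \tag{90}$$

Combining (81), (89) and (90), we obtain

$$f(\hat\rho) \le \cdots = (1 - s/k) \ln(1 - s/k) + O(1/k) \le -\frac{s}{k} (1 - s/k) + \tilde{O}(1/k) < 0,$$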

as desired. 

E.5 Proof of Proposition E.5

Let $k - \sqrt{k} \le s \le k - 1$ and let $\rho \in \mathcal{R}$ be an $s$-stable matrix that maximizes $f$. Without loss of generality we may assume that $\rho_{ii} \ge 1 - \kappa$ for $i = 1, \ldots, s$ and, due to Proposition E.2, that $\rho_{ij} \le 0.15$ for all other entries. We thus have

$$q = \sum_{i=1}^{s} \sum_{j \ne i} \rho_{ij} = s - \sum_{i=1}^{s} \rho_{ii} \le \kappa s. \tag{91}$$

Furthermore, because $\rho$ is doubly-stochastic we have

$$t = \sum_{i=1}^{s} \sum_{j > s} \rho_{ij} = \sum_{i > s} \sum_{j=1}^{s} \rho_{ij}.$$

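Letting $\Xi = \frac{1}{k} \sum_{i=1}^{s} h(\rho_{ii})$, we have by Corollary A.4 and the concavity of $h$

$$\frac{1}{k} \sum_{i \le s} H(\rho_i) \le \Xi + \frac{q}{k} h(t/q) + \frac{t}{k} \ln(k-s) + \frac{q-t}{k} \ln(s), \tag{92}$$

$$\begin{aligned} \frac{1}{k} \sum_{i > s} H(\rho_i) &\le \frac{k-s}{k} h\Big( \frac{t}{k-s} \Big) + \frac{t}{k} \ln s + \frac{k-s-t}{k} \ln(k-s) \\ &\le \frac{k-s}{k} \ln 2 + \frac{t}{k} \ln s + \frac{k-s-t}{k} \ln(k-s). \end{aligned} \tag{93}$$

Combining (92) and (93), we find

$$\begin{aligned} H(\rho) &\le \ln k + \Xi + \frac{q}{k} \ln s + \frac{k-s}{k} \ln(2(k-s)) + \frac{q}{k} h(t/q) \\ &\le \ln k + \Xi + \frac{q}{k} \ln s + \frac{k-s}{k} \ln(2(k-s)) + \frac{t}{k} (1 - \ln t + \ln q) \\ &\le \ln k + \Xi + \frac{q}{k} \ln k + \frac{k-s}{k} \ln(k-s) + \frac{k-s}{k} \cdot O(\ln \ln k). \end{aligned} \tag{94}$$

The last inequality follows because $t \le k - s$, the function $t \mapsto -t \ln t$ is upper bounded and $q \le \kappa s = \tilde{O}(1)$.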

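The Frobenius norm can be estimated as follows. Since $\rho_{ii} \ge 1 - \kappa$ for all $i \le s$, we have

$$\sum_{i=1}^{s} \|\rho_i\|_2^2 \le s\kappa^2 + \sum_{i=1}^{s} \rho_{ii}^2 \le \tilde{O}(1/k) + \sum_{i=1}^{s} \rho_{ii}^2.$$

The middle term arises because for all $i \in [s]$ and all $j \in [k] \setminus \{i\}$ we have $\rho_{ij} \le \kappa$ (as $\rho_{ii} \ge 1 - \kappa$) and $q = \sum_{i \in [s]} \sum_{j \ne i} \rho_{ij} \le \kappa s$ by (91). By a similar token,

$$\sum_{j \in [k]} \rho_{ij}^2 \le 0.15 \qquad (i > s).$$

This is because $\rho_{ij} \le 0.15$ for all $j$ and $\sum_{j \in [k]} \rho_{ij} = 1$. Hence,

$$\|\rho\|_2^2 \le \sum_{i=1}^{s} \rho_{ii}^2 + 0.15(k-s) + \tilde{O}(1/k).$$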
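
The constant $0.15(k-s)$ comes from a one-line estimate for the rows $i > s$. As a worked step, using only the two facts quoted above (every entry of such a row is at most $0.15$, and the row sums to $1$):

```latex
\begin{align*}
\sum_{j\in[k]} \rho_{ij}^2
  \;\le\; \Big(\max_{j\in[k]} \rho_{ij}\Big) \sum_{j\in[k]} \rho_{ij}
  \;\le\; 0.15 \cdot 1 ,
\qquad\text{so}\qquad
\sum_{i>s}\,\sum_{j\in[k]} \rho_{ij}^2 \;\le\; 0.15\,(k-s).
\end{align*}
```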
Thus, if we let id denote the identity matrix, we obtain from (42) that

$$E(\rho) - E(\mathrm{id}) \le (1 + \tilde{O}(1/k)) \frac{\ln k}{k} \Big[ -0.85(k-s) + \tilde{O}(1/k) + \sum_{i=1}^{s} (\rho_{ii}^2 - 1) \Big]. \tag{95}$$
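Combining (94) and (95) and using the inequality $h(z) \le z(1 - \ln z)$, we find

$$\begin{aligned} f(\rho) - f(\mathrm{id}) &\le \frac{k-s}{k} \big[ \ln(k-s) + O(\ln \ln k) - 0.85 \ln k \big] + O(1/k) \\ &\qquad + \Xi + \frac{q}{k} \ln k + (1 + \tilde{O}(1/k)) \frac{\ln k}{k} \sum_{i=1}^{s} (\rho_{ii}^2 - 1) \\ &\le -\frac{k-s}{3k} \ln k + O(1/k) + \Xi + \frac{q}{k} \ln k + \frac{\ln k}{k} \sum_{i=1}^{s} (\rho_{ii}^2 - 1), \end{aligned} \tag{96}$$

because $k - s \le \sqrt{k}$.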

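To complete the proof, let $\kappa_i = 1 - \rho_{ii}$ for $i = 1, \ldots, s$. A glimpse at (91) shows that $q = \sum_{i=1}^{s} \kappa_i$. Moreover, $\Xi = \frac{1}{k} \sum_{i=1}^{s} h(\kappa_i)$, as $h(1-z) = h(z)$ for $z \in (0,1)$. Since $\kappa_i \le \kappa = \tilde{O}(1/k)$, we have

$$\begin{aligned} \Xi + \frac{q}{k} \ln k + \frac{\ln k}{k} \sum_{i=1}^{s} (\rho_{ii}^2 - 1) &\le \frac{1}{k} \sum_{i=1}^{s} \big[ h(\kappa_i) + \kappa_i \ln k + ((1 - \kappa_i)^2 - 1) \ln k \big] \\ &\le \frac{1}{k} \sum_{i=1}^{s} \big[ h(\kappa_i) + \kappa_i \ln k + (\kappa_i^2 - 2\kappa_i) \ln k \big] \\ &\le \tilde{O}(1/k^2) + \frac{1}{k} \sum_{i=1}^{s} h(\kappa_i) - \kappa_i \ln k. \end{aligned}$$

Hence, by Lemma E.16

$$\Xi + \frac{q}{k} \ln k + \frac{\ln k}{k} \sum_{i=1}^{s} (\rho_{ii}^2 - 1) \le \tilde{O}(1/k^2) + \frac{s}{k^2} = O(1/k).$$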



Plugging this bound into (96), we find

$$f(\rho) \le -\frac{k-s}{3k} \ln k + O(1/k) + f(\mathrm{id}) \le -\frac{\ln k}{3k} + O(1/k) < 0,$$

as claimed.



F The condensation transition

In this final section we prove Proposition 2.2. The second moment bound from Proposition 4.3 implies together with Paley-Zygmund (6) that $Z_{k\text{-col}} \ge 0.5\, \mathrm{E}[Z_{k\text{-col}}]$ with a non-vanishing probability for $d < d_{k,\mathrm{cond}} - o_k(1)$. Via a concentration result for $Z_{k\text{-col}}$ from [2] we can boost this probability to get $Z_{k\text{-col}} = \exp(o(n)) \cdot \mathrm{E}[Z_{k\text{-col}}]$ w.h.p. Hence,

$$\mathrm{E}\big[ Z_{k\text{-col}}^{1/n} \big] \sim k(1 - 1/k)^{d/2} \sim \mathrm{E}[Z_{k\text{-col}}]^{1/n} \tag{97}$$

for $d < d_{k,\mathrm{cond}} - o_k(1)$, and thus $\varphi(d) = k(1 - 1/k)^{d/2}$.
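
To get a feel for the growth rate $k(1 - 1/k)^{d/2}$ in (97), the following sketch evaluates it and locates the density at which it drops to 1, which is where a first-moment argument stops permitting colorings. The function names are hypothetical, and the comparison with $2k\ln k - \ln k$ is only a numerical observation for the sample value of $k$ used here, not a statement taken from this section.

```python
import math


def first_moment_rate(k: int, d: float) -> float:
    """The quantity k (1 - 1/k)^(d/2) appearing in (97)."""
    return k * (1.0 - 1.0 / k) ** (d / 2.0)


def first_moment_zero(k: int) -> float:
    """Density d at which k (1 - 1/k)^(d/2) equals exactly 1."""
    return 2.0 * math.log(k) / (-math.log(1.0 - 1.0 / k))


k = 100
d0 = first_moment_zero(k)
print(d0, 2 * k * math.log(k) - math.log(k))  # the two values are close
print(first_moment_rate(k, d0 - 1.0) > 1.0 > first_moment_rate(k, d0 + 1.0))
```

Below `d0` the rate exceeds 1, so the expected number of colorings grows exponentially in $n$; above `d0` it decays exponentially.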

The more interesting bit is to prove that this formula does not hold for $d > d_{k,\mathrm{cond}} + o_k(1)$. This is by way of the planted model. More precisely, because Corollary 2.2 deals with the total number $Z_{k\text{-col}}$ of $k$-colorings, we need to tweak the experiment P1-P2 from Section 4 slightly: rather than choosing a balanced $k$-coloring in step P1, just choose $\sigma : V \to [k]$ uniformly at random. Let $(G, \sigma)$ be the result of this tweaked experiment. Using a double-counting argument, one can show the following.

Lemma F.1 ([2]) Assume that (97) is true at density $d$. Then for any $\varepsilon > 0$ and for any event $A$ we have

$$\mathrm{P}[G \in A] \le \exp(-\varepsilon n) \quad \Rightarrow \quad \mathrm{P}[G(n,m) \in A] = o(1). \tag{98}$$
We are going to exhibit an event $A$ that violates (98). To define $A$, we compare the expected number of $k$-colorings (97) with the formula for the cluster size in Lemma 4.4 (which survives our modification of the planted model). A little calculation shows that for $d > d_{k,\mathrm{cond}} + o_k(1)$ the typical cluster size exceeds the expected number of $k$-colorings in $G(n,m)$ by an exponential factor!

Hence, an obvious candidate seems to be the event $A' = \{ Z_{k\text{-col}} \le n \cdot \mathrm{E}[Z_{k\text{-col}}] \}$, say. However, we cannot show that this event is sufficiently unlikely in the planted model. To circumvent this issue, we work with the partition function (a similar trick was used in ifTTI for hypergraph 2-coloring): given $\beta > 0$ and a graph $G = (V, E)$, we let

$$Z_{k,\beta}(G) = \sum_{\sigma : V \to [k]} \exp\big( -\beta \cdot |\{ e \in E : e \text{ is monochromatic under } \sigma \}| \big). \tag{99}$$
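
For very small graphs the partition function (99) can be evaluated directly from its definition. The brute-force sketch below does that; it is exponential in the number of vertices and is meant only to illustrate the definition. The function name, the 0-based vertex and color labels, and the edge-list representation are assumptions made for this example.

```python
import math
from itertools import product


def partition_function(n, edges, k, beta):
    """Brute-force Z_{k,beta}(G) for a graph on vertices 0..n-1.

    Sums exp(-beta * #monochromatic edges) over all k^n maps sigma: V -> [k].
    """
    total = 0.0
    for sigma in product(range(k), repeat=n):
        mono = sum(1 for (u, v) in edges if sigma[u] == sigma[v])
        total += math.exp(-beta * mono)
    return total


triangle = [(0, 1), (1, 2), (0, 2)]
print(partition_function(3, triangle, 3, 0.0))   # beta = 0 counts all 27 maps
print(partition_function(3, triangle, 3, 50.0))  # large beta: close to the 6 proper 3-colorings
```

At $\beta = 0$ every map contributes 1, so $Z_{k,0}(G) = k^n$, while for large $\beta$ only maps without monochromatic edges contribute noticeably, so $Z_{k,\beta}(G)$ approaches the number of proper $k$-colorings from above.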

Azuma's inequality easily shows that $\ln Z_{k,\beta}$ is tightly concentrated. This yields

Lemma F.2 There is $\varepsilon_k = o_k(1)$ such that for $d > d_{k,\mathrm{cond}} + \varepsilon_k$ there exist numbers $\zeta = \zeta(k,d) > 0$, $\alpha > 0$, $\beta > 0$ such that $\ln Z_{k,\beta}(G(n,m)) \le \zeta n$ w.h.p. while $\mathrm{P}[\ln Z_{k,\beta}(G) \le \zeta n] \le \exp(-\alpha n)$.

Hence, the event $A = \{ \ln Z_{k,\beta} \le \zeta n \}$ violates (98) and thus (97) does not hold for $d > d_{k,\mathrm{cond}} + \varepsilon_k$. This implies that there are two possible scenarios regarding the limit $\varphi(d)$: either it does not exist for some $d \in (d_{k,\mathrm{cond}} - \varepsilon_k, d_{k,\mathrm{cond}} + \varepsilon_k)$, or the limit exists throughout this interval but $\varphi(d) < k(1 - 1/k)^{d/2}$ for some $d \in (d_{k,\mathrm{cond}} - \varepsilon_k, d_{k,\mathrm{cond}} + \varepsilon_k)$. In the latter case, $\varphi(d)$ cannot be analytic on $(d_{k,\mathrm{cond}} - \varepsilon_k, d_{k,\mathrm{cond}} + \varepsilon_k)$ due to the uniqueness of analytic continuations (as $d \mapsto k(1 - 1/k)^{d/2}$ is analytic on all of $\mathbb{C}$).